The human intestine hosts a complex bacterial community that plays a major role in nutrition and in maintaining human health. A functional metagenomic approach was used to explore the prebiotic breakdown potential of human gut bacteria, including non-cultivated ones. Two metagenomic libraries, constructed from ileum mucosa and fecal microbiota, were screened for hydrolytic activities on the prebiotic carbohydrates inulin, fructo-oligosaccharides, xylo-oligosaccharides, galacto-oligosaccharides and lactulose. The DNA inserts of 17 clones, selected from the 167 hits that were identified, were pyrosequenced in depth, yielding a total of 407,420 bp of metagenomic DNA. From these sequences, we discovered novel prebiotic degradation pathways containing carbohydrate transporters and hydrolysing enzymes, for which we provided the first experimental proof of function. Twenty of these proteins are encoded by genes that are also present in the gut metagenome of at least 100 subjects, regardless of their age or geographical origin. The taxonomic assignment of the sequences indicated that still unknown bacteria, for which neither culture conditions nor genome sequences are available, possess the enzymatic machinery to hydrolyse the prebiotic carbohydrates tested. The results broaden our view of how prebiotics are metabolized along the intestine, and open new perspectives for the design of functional foods.
Artificial human gut microbial communities implanted into germ-free mice provide insights into how species-level responses to changes in diet give rise to community-level structural and functional reconfiguration and how types of bacteria prioritize use of available nutrients in vivo.
The human gut microbiota is an important metabolic organ, yet little is known about how its individual species interact, establish dominant positions, and respond to changes in environmental factors such as diet. In this study, gnotobiotic mice were colonized with an artificial microbiota comprising 12 sequenced human gut bacterial species and fed oscillating diets of disparate composition. Rapid, reproducible, and reversible changes in the structure of this assemblage were observed. Time-series microbial RNA-Seq analyses revealed staggered functional responses to diet shifts throughout the assemblage that were heavily focused on carbohydrate and amino acid metabolism. High-resolution shotgun metaproteomics confirmed many of these responses at a protein level. One member, Bacteroides cellulosilyticus WH2, proved exceptionally fit regardless of diet. Its genome encoded more carbohydrate active enzymes than any previously sequenced member of the Bacteroidetes. Transcriptional profiling indicated that B. cellulosilyticus WH2 is an adaptive forager that tailors its versatile carbohydrate utilization strategy to available dietary polysaccharides, with a strong emphasis on plant-derived xylans abundant in dietary staples like cereal grains. Two highly expressed, diet-specific polysaccharide utilization loci (PULs) in B. cellulosilyticus WH2 were identified, one with characteristics of xylan utilization systems. Introduction of a B. cellulosilyticus WH2 library comprising >90,000 isogenic transposon mutants into gnotobiotic mice, along with the other artificial community members, confirmed that these loci represent critical diet-specific fitness determinants. Carbohydrates that trigger dramatic increases in expression of these two loci and many of the organism's 111 other predicted PULs were identified by RNA-Seq during in vitro growth on 31 distinct carbohydrate substrates, allowing us to better interpret in vivo RNA-Seq and proteomics data. 
These results offer insight into how gut microbes adapt to dietary perturbations at both a community level and from the perspective of a well-adapted symbiont with exceptional saccharolytic capabilities, and illustrate the value of artificial communities.
Our intestines are populated by an almost unimaginably large number of microbial cells, most of which are bacteria. This species assemblage operates as a microbial metabolic organ, performing myriad tasks that contribute to our well-being, including processing components of our diet. The way this incredible machine assembles itself and operates remains mysterious. One approach to understanding its properties is to create artificial communities composed of a limited number of sequenced human gut bacterial species and to install them in the guts of germ-free mice that are then fed different diets. In this report, we adopt this approach. We describe the genome sequence of a new gut bacterial isolate, Bacteroides cellulosilyticus WH2, which is equipped with an unprecedented number of carbohydrate active enzymes. Deploying four different “omics” technologies, we characterize the response to diet, the relative stability, and the temporal dynamics of a 12-species artificial bacterial assemblage (including B. cellulosilyticus WH2) implanted in germ-free mouse guts. We also combine high-throughput substrate utilization screens and RNA-Seq to generate reference data analogous to a “Rosetta stone” in order to decipher what types of carbohydrates B. cellulosilyticus encounters and uses within the gut, and how it interacts with other organisms that have similar and/or distinct “professions.” This work sets the stage for future ecological and metabolic studies of more complex assemblages that more fully emulate the properties of our native gut communities.
The genome of the coprophilic ascomycete Podospora anserina contains 33 different genes encoding copper-dependent lytic polysaccharide monooxygenases (LPMOs) from glycoside hydrolase family 61 (GH61). In this study, two of these enzymes (P. anserina GH61A [PaGH61A] and PaGH61B), both of which harbor a family 1 carbohydrate binding module, were successfully produced in Pichia pastoris. Synergistic cooperation of PaGH61A or PaGH61B with the cellobiose dehydrogenase (CDH) of Pycnoporus cinnabarinus on cellulose resulted in the formation of oxidized and nonoxidized cello-oligosaccharides. A striking difference between PaGH61A and PaGH61B was observed through the identification of the products, among which were doubly and triply oxidized cellodextrins, which were released only by the combination of PaGH61B with CDH. The mass spectrometry fragmentation patterns of these oxidized products could be consistent with oxidation at the C-6 position with a geminal diol group. The different properties of PaGH61A and PaGH61B and their effect on the interaction with CDH are discussed in regard to the proposed in vivo function of the CDH/GH61 enzyme system in oxidative cellulose hydrolysis.
The metagenomic analysis of gut microbiomes has emerged as a powerful strategy for the identification of biomass-degrading enzymes, which will no doubt be useful for the development of advanced biorefining processes. In the present study, we performed a functional metagenomic analysis of the comb and gut microbiomes associated with the fungus-growing termite Pseudacanthotermes militaris.
Using whole termite abdomens and fungal-comb material respectively, two fosmid-based metagenomic libraries were created and screened for the presence of xylan-degrading enzymes. This revealed 101 positive clones, corresponding to an extremely high global hit rate of 0.49%. Many clones displayed either β-D-xylosidase (EC 3.2.1.37) or α-L-arabinofuranosidase (EC 3.2.1.55) activity, while others displayed the ability to degrade AZCL-xylan or AZCL-β-(1,3)-β-(1,4)-glucan. Using secondary screening it was possible to pinpoint clones of interest that were used to prepare fosmid DNA. Sequencing of fosmid DNA generated 1.46 Mbp of sequence data, and bioinformatics analysis revealed 63 sequences encoding putative carbohydrate-active enzymes, with many of these forming parts of sequence clusters, probably having carbohydrate degradation and metabolic functions. Taxonomic assignment of the different sequences revealed that Firmicutes and Bacteroidetes were predominant phyla in the gut sample, while microbial diversity in the comb sample resembled that of typical soil samples. Cloning and expression in E. coli of six enzyme candidates identified in the libraries provided access to individual enzyme activities, which all proved to be coherent with the primary and secondary functional screens.
This study shows that the gut microbiome of P. militaris possesses the potential to degrade biomass components, such as arabinoxylans and arabinans. Moreover, the data presented suggests that prokaryotic microorganisms present in the comb could also play a part in the degradation of biomass within the termite mound, although further investigation will be needed to clarify the complex synergies that might exist between the different microbiomes that constitute the termitosphere of fungus-growing termites. This study exemplifies the power of functional metagenomics for the discovery of biomass-active enzymes and has provided a collection of potentially interesting biocatalysts for further study.
Functional metagenomics; Fungus-growing termite; Glycoside hydrolases; Hemicellulases; Biomass degradation; Biorefinery
The microbial production of ethanol from lignocellulosic biomass is a multi-component process that involves biomass hydrolysis, carbohydrate transport and utilization, and finally, the production of ethanol. Strains of the genus Thermoanaerobacter have been studied for decades due to their innate abilities to produce comparatively high ethanol yields from hemicellulose constituent sugars. However, their inability to hydrolyze cellulose limits their usefulness in lignocellulosic biofuel production. As such, co-culturing Thermoanaerobacter spp. with cellulolytic organisms is a plausible approach to improving lignocellulose conversion efficiencies and yields of biofuels. To evaluate native lignocellulosic ethanol production capacities relative to competing fermentative end-products, comparative genomic analysis of 11 sequenced Thermoanaerobacter strains, including a de novo genome, Thermoanaerobacter thermohydrosulfuricus WC1, was conducted. Analysis was specifically focused on the genomic potential for each strain to address all aspects of ethanol production mentioned through a consolidated bioprocessing approach. Whole genome functional annotation analysis identified three distinct clades within the genus. The genomes of Clade 1 strains encode the fewest extracellular carbohydrate active enzymes and also show the least diversity in terms of lignocellulose relevant carbohydrate utilization pathways. However, these same strains reportedly are capable of directing a higher proportion of their total carbon flux towards ethanol, rather than non-biofuel end-products, than other Thermoanaerobacter strains. Strains in Clade 2 show the greatest diversity in terms of lignocellulose hydrolysis and utilization, but proportionately produce more non-ethanol end-products than Clade 1 strains. Strains in Clade 3, in which T. thermohydrosulfuricus WC1 is included, show mid-range potential for lignocellulose hydrolysis and utilization, but also exhibit extensive divergence from both Clade 1 and Clade 2 strains in terms of cellular energetics. The potential implications regarding strain selection and suitability for industrial ethanol production through a consolidated bioprocessing co-culturing approach are examined throughout the manuscript.
Since its inception, the carbohydrate-active enzymes database (CAZy; http://www.cazy.org) has described the families of enzymes that cleave or build complex carbohydrates, namely the glycoside hydrolases (GH), the polysaccharide lyases (PL), the carbohydrate esterases (CE), the glycosyltransferases (GT) and their appended non-catalytic carbohydrate-binding modules (CBM). The recent discovery that members of families CBM33 and GH61 are in fact lytic polysaccharide monooxygenases (LPMOs) demands a reclassification of these families into a suitable category.
Because lignin is invariably found together with polysaccharides in the plant cell wall and because lignin fragments are likely to act in concert with LPMOs, we have decided to join the families of lignin degradation enzymes to the LPMO families and launch a new CAZy class that we name “Auxiliary Activities” in order to accommodate a range of enzyme mechanisms and substrates related to lignocellulose conversion. Comparative analyses of these auxiliary activities in 41 fungal genomes reveal a pertinent division of several fungal groups and subgroups combining their phylogenetic origin and their nutritional mode (white vs. brown rot).
The new class introduced in the CAZy database extends the traditional CAZy families, and provides a better coverage of the full extent of the lignocellulose breakdown machinery.
CAZy database; Evolution of lignocellulose breakdown; Ligninolytic enzymes; Lytic polysaccharide monooxygenases
Pyrenophora tritici-repentis is a necrotrophic fungus that causes tan spot of wheat, a disease whose contribution to crop loss has increased significantly during the last few decades. Pathogenicity by this fungus is attributed to the production of host-selective toxins (HSTs), which are recognized by their host in a genotype-specific manner. To better understand the mechanisms that have led to the increase in disease incidence related to this pathogen, we sequenced the genomes of three P. tritici-repentis isolates. A pathogenic isolate that produces two known HSTs was used to assemble a reference nuclear genome of approximately 40 Mb composed of 11 chromosomes that encode 12,141 predicted genes. Comparison of the reference genome with those of a pathogenic isolate that produces a third HST, and a nonpathogenic isolate, showed the nonpathogen genome to be more diverged than those of the two pathogens. Examination of gene-coding regions has provided candidate pathogen-specific proteins and revealed gene families that may play a role in a necrotrophic lifestyle. Analysis of transposable elements suggests that their presence in the genome of pathogenic isolates contributes to the creation of novel genes, effector diversification, possible horizontal gene transfer events, identified copy number variation, and the first example of transduplication by DNA transposable elements in fungi. Overall, comparative analysis of these genomes provides evidence that pathogenicity in this species arose through an influx of transposable elements, which created a genetically flexible landscape that can easily respond to environmental changes.
wheat (Triticum aestivum); copy number variation; histone H3 transduplication; ToxA; ToxB; anastomosis
To derive post-genomic, neutral insight into the peptidoglycan (PG) distribution among organisms, we mined 1,644 genomes listed in the Carbohydrate-Active Enzymes database for the presence of a minimal 3-gene set that is necessary for PG metabolism. This gene set consists of one gene from the glycosyltransferase family GT28, one from family GT51 and at least one gene belonging to one of five glycoside hydrolase families (GH23, GH73, GH102, GH103 and GH104).
None of the 103 Viruses or 101 Archaea examined possessed the minimal 3-gene set, but this set was detected in 1/42 of the Eukarya members (Micromonas sp., coding for GT28, GT51 and GH103) and in 1,260/1,398 (90.1%) of Bacteria, with a 100% positive predictive value for the presence of PG. A Pearson correlation test showed that GT51 family genes were significantly associated with PG, with a value of 0.963 and a p value less than 10⁻³. This result was confirmed by a phylogenetic comparative analysis showing that the GT51-encoding gene was significantly associated with PG, with Pagel’s scores of 60 and 51 (percentage of error close to 0%). Phylogenetic analysis indicated that the GT51 gene history comprised eight loss events and one gain event, and suggested a dynamic ongoing process.
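The minimal gene-set rule described above can be sketched as a simple membership test. This is only an illustration: the function name, the constant name and the input format (a list of CAZy family labels per genome) are assumptions, not the pipeline used in the study. The family names and the counts are taken from the text.

```python
# Hypothetical helper: names and input format are illustrative assumptions.
# The five glycoside hydrolase families are those listed in the text.
PG_HYDROLASE_FAMILIES = {"GH23", "GH73", "GH102", "GH103", "GH104"}


def has_minimal_pg_gene_set(families):
    """Return True if a genome's CAZy family annotations include GT28,
    GT51 and at least one of the five listed GH families."""
    present = set(families)
    return (
        "GT28" in present
        and "GT51" in present
        and bool(present & PG_HYDROLASE_FAMILIES)
    )


# Micromonas sp., as reported in the text: GT28, GT51 and GH103.
micromonas = ["GT28", "GT51", "GH103"]
print(has_minimal_pg_gene_set(micromonas))

# Share of bacterial genomes carrying the set, from the counts in the text.
bacteria_fraction = 1260 / 1398
print(f"{bacteria_fraction:.1%}")
```

A genome lacking any one of the three components, for example one annotated with GT28 and GH23 but no GT51, fails the test.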
Genome analysis is a neutral approach to explore prospectively the presence of PG in uncultured, sequenced organisms with high predictive values.
Peptidoglycan; Genome; Glycoside hydrolase; Glycosyltransferase; Gram; Beta-lactamines; Glycopeptides
The class Dothideomycetes is one of the largest groups of fungi with a high level of ecological diversity including many plant pathogens infecting a broad range of hosts. Here, we compare genome features of 18 members of this class, including 6 necrotrophs, 9 (hemi)biotrophs and 3 saprotrophs, to analyze genome structure, evolution, and the diverse strategies of pathogenesis. The Dothideomycetes most likely evolved from a common ancestor more than 280 million years ago. The 18 genome sequences differ dramatically in size due to variation in repetitive content, but show much less variation in number of (core) genes. Gene order appears to have been rearranged mostly within chromosomal boundaries by multiple inversions, in extant genomes frequently demarcated by adjacent simple repeats. Several Dothideomycetes contain one or more gene-poor, transposable element (TE)-rich putatively dispensable chromosomes of unknown function. The 18 Dothideomycetes offer an extensive catalogue of genes involved in cellulose degradation, proteolysis, secondary metabolism, and cysteine-rich small secreted proteins. Ancestors of the two major orders of plant pathogens in the Dothideomycetes, the Capnodiales and Pleosporales, may have had different modes of pathogenesis, with the former having fewer of these genes than the latter. Many of these genes are enriched in proximity to transposable elements, suggesting faster evolution because of the effects of repeat induced point (RIP) mutations. A syntenic block of genes, including oxidoreductases, is conserved in most Dothideomycetes and upregulated during infection in L. maculans, suggesting a possible function in response to oxidative stress.
Dothideomycetes is the largest and most ecologically diverse class of fungi that includes many plant pathogens with high economic impact. Currently 18 genome sequences of Dothideomycetes are available, 14 of which are newly described in this paper and in several companion papers, allowing unprecedented resolution in comparative analyses. These 18 organisms have diverse lifestyles and strategies of plant pathogenesis. Three feed on dead organic matter only, six are necrotrophs (killing the host plant cells), one is a biotroph (forming an association with, and thus feeding on, the living cells of the host plant) and eight are hemibiotrophs (having an initial biotrophic stage, and killing the host plant at a later stage). These various lifestyles are also reflected in the gene sets present in each group. For example, sets of genes involved in carbohydrate degradation and secondary metabolism are expanded in necrotrophs. Many genes involved in pathogenesis are located near repetitive sequences, which are believed to speed up their evolution. Blocks of genes with conserved gene order were identified. In addition, we deduce that the mechanism for mesosynteny, a type of genome evolution particular to Dothideomycetes, is intra-chromosomal inversion.
We sequenced and compared the genomes of the Dothideomycete fungal plant pathogens Cladosporium fulvum (Cfu) (syn. Passalora fulva) and Dothistroma septosporum (Dse), which are closely related phylogenetically but have different lifestyles and hosts. Although both fungi grow extracellularly in close contact with host mesophyll cells, Cfu is a biotroph infecting tomato, while Dse is a hemibiotroph infecting pine. The genomes of these fungi have a similar set of genes (70% of the gene content in both genomes are homologs), but differ significantly in size (Cfu >61.1 Mb; Dse 31.2 Mb), which is mainly due to the difference in repeat content (47.2% in Cfu versus 3.2% in Dse). Recent adaptation to different lifestyles and hosts is suggested by diverged sets of genes. Cfu contains an α-tomatinase gene that we predict might be required for detoxification of tomatine, while this gene is absent in Dse. Many genes encoding secreted proteins are unique to each species and the repeat-rich areas in Cfu are enriched for these species-specific genes. In contrast, conserved genes suggest common host ancestry. Homologs of Cfu effector genes, including Ecp2 and Avr4, are present in Dse and induce a Cf-Ecp2- and Cf-4-mediated hypersensitive response, respectively. Strikingly, genes involved in production of the toxin dothistromin, a likely virulence factor for Dse, are conserved in Cfu, but their expression differs markedly with essentially no expression by Cfu in planta. Likewise, Cfu has a carbohydrate-degrading enzyme catalog that is more similar to that of necrotrophs or hemibiotrophs and a larger pectinolytic gene arsenal than Dse, but many of these genes are not expressed in planta or are pseudogenized. Overall, comparison of their genomes suggests that these closely related plant pathogens had a common ancestral host but have since adapted to different hosts and lifestyles by a combination of differentiated gene content, pseudogenization, and gene regulation.
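The statement that the size difference is mainly due to repeat content can be checked with a short calculation. The sketch below uses only the genome sizes and repeat percentages quoted above; the function and variable names are illustrative. Because the Cfu size is reported as a lower bound (>61.1 Mb), the Cfu figures are minimum estimates.

```python
# Genome sizes (Mb) and repeat fractions as quoted in the text.
# The Cfu size is a lower bound (>61.1 Mb), so its values are minimums.
genomes = {
    "Cfu": {"size_mb": 61.1, "repeat_fraction": 0.472},
    "Dse": {"size_mb": 31.2, "repeat_fraction": 0.032},
}


def split_genome(size_mb, repeat_fraction):
    """Return (repeat_mb, non_repeat_mb) for a genome of the given size."""
    repeat_mb = size_mb * repeat_fraction
    return repeat_mb, size_mb - repeat_mb


for name, g in genomes.items():
    repeat_mb, non_repeat_mb = split_genome(g["size_mb"], g["repeat_fraction"])
    print(f"{name}: repeats {repeat_mb:.1f} Mb, non-repeat {non_repeat_mb:.1f} Mb")
```

Under these figures the non-repeat portions come out at about 32.3 Mb for Cfu and 30.2 Mb for Dse, so repeats account for roughly 28 Mb of the approximately 30 Mb size difference.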
We compared the genomes of two closely related pathogens with very different lifestyles and hosts: C. fulvum (Cfu), a biotroph of tomato, and D. septosporum (Dse), a hemibiotroph of pine. Some differences in gene content were identified that can be directly related to their different hosts, such as the presence of a gene involved in degradation of a tomato saponin only in Cfu. However, in general the two species share a surprisingly large proportion of genes. Dse has functional homologs of Cfu effector genes, while Cfu has genes for biosynthesis of dothistromin, a toxin probably associated with virulence in Dse. Cfu also has an unexpectedly large content of genes for biosynthesis of other secondary metabolites and degradation of plant cell walls compared to Dse, contrasting with its host preference and lifestyle. However, many of these genes were not expressed in planta or were pseudogenized. These results suggest that evolving species may retain genetic signatures of the host and lifestyle preferences of their ancestor and that evolution of new genes, gene regulation, and pseudogenization are important factors in adaptation.
Crohn's disease (CD) is an inflammatory bowel disease of complex etiology, although dysbiosis of the gut microbiota has been implicated in chronic immune-mediated inflammation associated with CD. Here we combined shotgun metagenomic and metaproteomic approaches to identify potential functional signatures of CD in stool samples from six twin pairs that were either healthy, or that had CD in the ileum (ICD) or colon (CCD). Integration of these omics approaches revealed several genes, proteins, and pathways that primarily differentiated ICD from healthy subjects, including depletion of many proteins in ICD. In addition, the ICD phenotype was associated with alterations in bacterial carbohydrate metabolism, bacterial-host interactions, as well as human host-secreted enzymes. This eco-systems biology approach underscores the link between the gut microbiota and functional alterations in the pathophysiology of Crohn's disease and aids in identification of novel diagnostic targets and disease specific biomarkers.
Bifidobacteria are known as anaerobic/microaerophilic and fermentative microorganisms, which commonly inhabit the gastrointestinal tract of various animals and insects. Analysis of the 2,167,301 bp genome of Bifidobacterium asteroides PRL2011, a strain isolated from the hindgut of Apis mellifera var. ligustica, commonly known as the honey bee, revealed its predicted capability for respiratory metabolism. Conservation of the corresponding gene clusters in various B. asteroides strains reinforces the notion that respiration is a common metabolic feature of this ancient bifidobacterial species, which has been lost in currently known mammal-derived Bifidobacterium species. In fact, phylogenomic analyses suggested an ancient origin of B. asteroides and pointed to it as an ancestor of the genus Bifidobacterium. Furthermore, the B. asteroides PRL2011 genome encodes various enzymes for coping with toxic products that arise as a result of oxygen-mediated respiration.
The large Glycoside Hydrolase family 5 (GH5) groups together a wide range of enzymes acting on β-linked oligo- and polysaccharides, and glycoconjugates from a large spectrum of organisms. The long and complex evolution of this family of enzymes and its broad sequence diversity limits functional prediction. With the objective of improving the differentiation of enzyme specificities in a knowledge-based context, and to obtain new evolutionary insights, we present here a new, robust subfamily classification of family GH5.
About 80% of the current sequences were assigned into 51 subfamilies in a global analysis of all publicly available GH5 sequences and associated biochemical data. Examination of subfamilies with catalytically-active members revealed that one third are monospecific (containing a single enzyme activity), although new functions may be discovered with biochemical characterization in the future. Furthermore, twenty subfamilies presently have no characterization whatsoever and many others have only limited structural and biochemical data. Mapping of functional knowledge onto the GH5 phylogenetic tree revealed that the sequence space of this historical and industrially important family is far from well dispersed, highlighting targets in need of further study. The analysis also uncovered a number of GH5 proteins which have lost their catalytic machinery, indicating evolution towards novel functions.
Overall, the subfamily division of GH5 provides an actively curated resource for large-scale protein sequence annotation for glycogenomics; the subfamily assignments are openly accessible via the Carbohydrate-Active Enzyme database at
Protein evolution; Enzyme evolution; Functional prediction; Glycogenomics; Glycoside hydrolase family 5; Phylogenetic analysis; Subfamily classification
Softwood is the predominant form of land plant biomass in the Northern hemisphere, and is among the most recalcitrant biomass resources to bioprocess technologies. The white rot fungus, Phanerochaete carnosa, has been isolated almost exclusively from softwoods, while most other known white-rot species, including Phanerochaete chrysosporium, were mainly isolated from hardwoods. Accordingly, it is anticipated that P. carnosa encodes a distinct set of enzymes and proteins that promote softwood decomposition. To elucidate the genetic basis of softwood bioconversion by a white-rot fungus, the present study reports the P. carnosa genome sequence and its comparative analysis with the previously reported P. chrysosporium genome.
P. carnosa encodes a complete set of lignocellulose-active enzymes. Comparative genomic analysis revealed that P. carnosa is enriched with genes encoding manganese peroxidase, and that the most divergent glycoside hydrolase families were predicted to encode hemicellulases and glycoprotein degrading enzymes. Most remarkably, P. carnosa possesses one of the largest P450 contingents (266 P450s) among the sequenced and annotated wood-rotting basidiomycetes, nearly double that of P. chrysosporium. Along with metabolic pathway modeling, comparative growth studies on model compounds and chemical analyses of decomposed wood components showed greater tolerance of P. carnosa to various substrates including coniferous heartwood.
The P. carnosa genome is enriched with genes that encode P450 monooxygenases that can participate in extractives degradation, and manganese peroxidases involved in lignin degradation. The significant expansion of P450s in P. carnosa, along with differences in carbohydrate- and lignin-degrading enzymes, could be correlated to the utilization of heartwood and sapwood preparations from both coniferous and hardwood species.
Phanerochaete carnosa; Comparative genomics; Phanerochaete chrysosporium; Softwood degradation
Pectins are diverse and very complex biomolecules and their structure depends on the plant species and tissue. It was previously shown that derivatives of pectic polymers and oligosaccharides from pectins have positive effects on human health. To obtain specific pectic oligosaccharides, highly defined enzymatic mixes are required. Filamentous fungi are specialized in plant cell wall degradation and some produce a broad range of pectinases. They may therefore shed light on the enzyme mixes needed for partial hydrolysis.
The growth profiles of 12 fungi on four pectins and four structural elements of pectins show that the presence/absence of pectinolytic genes in the fungal genome clearly correlates with their ability to degrade pectins. However, this correlation is less clear when we zoom in to the pectic structural elements.
This study highlights the complexity of the mechanisms involved in fungal degradation of complex carbon sources such as pectins. Mining genomes and comparative genomics are promising first steps towards the production of specific pectinolytic fractions.
Microbial communities carry out the majority of the biochemical activity on the planet, and they play integral roles in processes including metabolism and immune homeostasis in the human microbiome. Shotgun sequencing of such communities' metagenomes provides information complementary to organismal abundances from taxonomic markers, but the resulting data typically comprise short reads from hundreds of different organisms and are at best challenging to assemble comparably to single-organism genomes. Here, we describe an alternative approach to infer the functional and metabolic potential of a microbial community metagenome. We determined the gene families and pathways present or absent within a community, as well as their relative abundances, directly from short sequence reads. We validated this methodology using a collection of synthetic metagenomes, recovering the presence and abundance both of large pathways and of small functional modules with high accuracy. We subsequently applied this method, HUMAnN, to the microbial communities of 649 metagenomes drawn from seven primary body sites on 102 individuals as part of the Human Microbiome Project (HMP). This provided a means to compare functional diversity and organismal ecology in the human microbiome, and we determined a core of 24 ubiquitously present modules. Core pathways were often implemented by different enzyme families within different body sites, and 168 functional modules and 196 metabolic pathways varied in metagenomic abundance specifically to one or more niches within the microbiome. These included glycosaminoglycan degradation in the gut, as well as phosphate and amino acid transport linked to host phenotype (vaginal pH) in the posterior fornix. An implementation of our methodology is available at http://huttenhower.sph.harvard.edu/humann. 
This provides a means to accurately and efficiently characterize microbial metabolic pathways and functional modules directly from high-throughput sequencing reads, enabling the determination of community roles in the HMP cohort and in future metagenomic studies.
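The idea of reading gene-family presence and relative abundance directly from short reads can be illustrated with a deliberately simplified sketch. This is not the HUMAnN implementation, which involves further steps not described in the text above. The function names, the threshold parameter and the family labels are hypothetical, and the input is assumed to be one gene-family label per already-assigned read.

```python
from collections import Counter


def family_relative_abundance(read_hits):
    """Map each gene family to its share of all assigned reads.

    read_hits: iterable of gene-family labels, one per assigned read
    (unassigned reads are assumed to have been dropped beforehand).
    """
    counts = Counter(read_hits)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {family: n / total for family, n in counts.items()}


def present_families(abundance, min_fraction=0.0):
    """Families whose relative abundance exceeds a hypothetical threshold."""
    return {family for family, frac in abundance.items() if frac > min_fraction}


# Made-up read assignments for illustration only.
hits = ["famA", "famA", "famB", "famC"]
abundance = family_relative_abundance(hits)
print(abundance["famA"])
print(sorted(present_families(abundance, min_fraction=0.3)))
```

Raw read shares like these would normally need further normalization (for example by gene length) before abundances are compared across families; that step is left out here to keep the sketch short.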
The human body is inhabited by trillions of bacteria and other microbes, which have recently been studied in many different habitats (including gut, mouth, skin, and urogenital) by the Human Microbiome Project (HMP). These microbial communities were assayed using high-throughput DNA sequencing, but it can be challenging to determine their biological functions based solely on the resulting short sequences. To reconstruct the metabolic activities of such communities, we have developed HUMAnN, a method to accurately infer community function directly from short DNA reads. The method's accuracy was validated using a collection of synthetic microbial communities. Applying HUMAnN to data from the HMP, we showed that, unlike individual microbial species, many metabolic processes were present among all body habitats. However, the frequencies of these processes varied dramatically, and some were highly enriched within individual habitats to provide niche specialization (e.g. in the gut, which is abundant in food matter but low in oxygen). Other community functions were linked specifically to properties of the human host, such as biochemical processes only present in vaginal habitats with particularly high or low pH. Studying additional environmental or disease-associated communities using HUMAnN will further improve our understanding of how the microbial organisms in a community are linked to the biological processes they carry out.
The various ecological habitats in the human body provide microbes a wide array of nutrient sources and survival challenges. Advances in technology such as DNA sequencing have allowed a deeper perspective into the molecular function of the human microbiota than has been achievable in the past. Here we aimed to examine the enzymes that cleave complex carbohydrates (CAZymes) in the human microbiome in order to determine (i) whether the CAZyme profiles of bacterial genomes are more similar within body sites or bacterial families and (ii) the sugar degradation and utilization capabilities of microbial communities inhabiting various human habitats. Upon examination of 493 bacterial reference genomes from 12 human habitats, we found that sugar degradation capabilities of taxa are more similar to others in the same bacterial family than to those inhabiting the same habitat. Yet, the analysis of 520 metagenomic samples from five major body sites shows that, even when the community composition varies, the CAZyme profiles are very similar within a body site, suggesting that the observed functional profile and microbial habitation have adapted to the local carbohydrate composition. When broad sugar utilization was compared within the five major body sites, the gastrointestinal tract contained the highest potential for total sugar degradation, while dextran and peptidoglycan degradation were highest in oral and vaginal sites, respectively. Our analysis suggests that the carbohydrate composition of each body site has a profound influence and probably constitutes one of the major driving forces that shapes the community composition and therefore the CAZyme profile of the local microbial communities, which in turn reflects the microbiome fitness to a body site.
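The comparison in question (i) can be illustrated with a small sketch. It assumes each genome is summarized as a vector of CAZyme family counts and uses Bray-Curtis dissimilarity as the distance. The metric, the function names, and the numbers are assumptions chosen for illustration, not the study's method or data.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


def mean_within_group_distance(profiles, labels):
    """Mean pairwise dissimilarity over all pairs that share a label."""
    distances = [
        bray_curtis(profiles[i], profiles[j])
        for i, j in combinations(range(len(profiles)), 2)
        if labels[i] == labels[j]
    ]
    return sum(distances) / len(distances) if distances else float("nan")


# Made-up CAZyme family counts for four hypothetical genomes.
profiles = [[10, 0, 2], [8, 1, 2], [0, 9, 5], [1, 10, 4]]
families = ["A", "A", "B", "B"]            # bacterial family of each genome
habitats = ["gut", "oral", "gut", "oral"]  # body site of each genome

within_family = mean_within_group_distance(profiles, families)
within_habitat = mean_within_group_distance(profiles, habitats)
```

With these invented counts, genomes that share a bacterial family are closer to each other than genomes that share a habitat, which is the pattern the abstract reports for reference genomes.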
Microbial degradation of plant cell walls and their conversion to sugars and other byproducts is a key step in the carbon cycle on Earth. In order to process heterogeneous plant-derived biomass, specialized anaerobic bacteria use an elaborate multi-enzyme cellulosome complex to synergistically deconstruct cellulosic substrates. The cellulosome was first discovered in the cellulolytic thermophile, Clostridium thermocellum, and much of our knowledge of this intriguing type of protein composite is based on the cellulosome of this environmentally and biotechnologically important bacterium. The recently sequenced genome of the cellulolytic mesophile, Acetivibrio cellulolyticus, allows detailed comparison of the cellulosomes of these two select cellulosome-producing bacteria.
Comprehensive analysis of the A. cellulolyticus draft genome sequence revealed a very sophisticated cellulosome system. Compared to C. thermocellum, the cellulosomal architecture of A. cellulolyticus is much more extensive, whereby the genome encodes twice the number of cohesin- and dockerin-containing proteins. The A. cellulolyticus genome has thus evolved an inflated number of 143 dockerin-containing genes, coding for multimodular proteins with distinctive catalytic and carbohydrate-binding modules that play critical roles in biomass degradation. Additionally, 41 putative cohesin modules distributed in 16 different scaffoldin proteins were identified in the genome, representing a broader diversity and modularity than those of Clostridium thermocellum. Although many of the A. cellulolyticus scaffoldins appear in unconventional modular combinations, elements of the basic structural scaffoldins are maintained in both species. In addition, both species exhibit similarly elaborate cell-anchoring and cellulosome-related gene-regulatory elements.
This work portrays a particularly intricate, cell-surface cellulosome system in A. cellulolyticus and provides a blueprint for examining the specific roles of the various cellulosomal components in the degradation of complex carbohydrate substrates of the plant cell wall by the bacterium.
Cellulosomics; Clostridium thermocellum; Scaffoldin; Cohesin; Dockerin
Understanding how the human gut microbiota and host are impacted by probiotic bacterial strains requires carefully controlled studies in humans and in mouse models of the gut ecosystem where potentially confounding variables that are difficult to control in humans can be constrained. Therefore, we characterized the fecal microbiomes and metatranscriptomes of adult female monozygotic twin pairs through repeated sampling 4 weeks prior to, 7 weeks during, and 4 weeks following consumption of a commercially available fermented milk product (FMP) containing a consortium of Bifidobacterium animalis subsp. lactis, two strains of Lactobacillus delbrueckii subsp. bulgaricus, Lactococcus lactis subsp. cremoris, and Streptococcus thermophilus. In addition, gnotobiotic mice harboring a 15-species model human gut microbiota whose genomes contain 58,399 known or predicted protein-coding genes were studied prior to and after gavage with all five sequenced FMP strains. No significant changes in bacterial species composition or in the proportional representation of genes encoding known enzymes were observed in the feces of humans consuming the FMP. Only minimal changes in microbiota configuration were noted in mice following single or repeated gavage with the FMP consortium. However, RNA-Seq analysis of fecal samples and follow-up mass spectrometry of urinary metabolites disclosed that introducing the FMP strains into mice results in significant changes in expression of microbiome-encoded enzymes involved in numerous metabolic pathways, most prominently those related to carbohydrate metabolism. B. animalis subsp. lactis, the dominant persistent member of the FMP consortium in gnotobiotic mice, upregulates a locus in vivo that is involved in the catabolism of xylooligosaccharides, a class of glycans widely distributed in fruits, vegetables and other foods, underscoring the importance of these sugars to this bacterial species. 
The human fecal metatranscriptome exhibited significant changes, confined to the period of FMP consumption, that mirror changes in gnotobiotic mice, including those related to plant polysaccharide metabolism. These experiments illustrate a translational research pipeline for characterizing the effects of fermented milk products on the human gut microbiome.
Mycoparasitism, a lifestyle where one fungus is parasitic on another fungus, has special relevance when the prey is a plant pathogen, providing a strategy for biological control of pests for plant protection. Probably, the most studied biocontrol agents are species of the genus Hypocrea/Trichoderma.
Here we report an analysis of the genome sequences of the two biocontrol species Trichoderma atroviride (teleomorph Hypocrea atroviridis) and Trichoderma virens (formerly Gliocladium virens, teleomorph Hypocrea virens), and a comparison with Trichoderma reesei (teleomorph Hypocrea jecorina). These three Trichoderma species display a remarkable conservation of gene order (78 to 96%), and a lack of active mobile elements probably due to repeat-induced point mutation. Several gene families are expanded in the two mycoparasitic species relative to T. reesei or other ascomycetes, and are overrepresented in non-syntenic genome regions. A phylogenetic analysis shows that T. reesei and T. virens are derived relative to T. atroviride. The mycoparasitism-specific genes thus arose in a common Trichoderma ancestor but were subsequently lost in T. reesei.
The data offer a better understanding of mycoparasitism, and thus support the development of improved biocontrol strains for efficient and environmentally friendly protection of plants.
This study is the first to use a metagenomics approach to characterize the phylogeny and functional capacity of the canine gastrointestinal microbiome. Six healthy adult dogs were used in a crossover design and fed a low-fiber control diet (K9C) or one containing 7.5% beet pulp (K9BP). Pooled fecal DNA samples from each treatment were subjected to 454 pyrosequencing, generating 503 280 (K9C) and 505 061 (K9BP) sequences. Dominant bacterial phyla included the Bacteroidetes/Chlorobi group and Firmicutes, both of which comprised ∼35% of all sequences, followed by Proteobacteria (13–15%) and Fusobacteria (7–8%). K9C had a greater percentage of Bacteroidetes, Fusobacteria and Proteobacteria, whereas K9BP had greater proportions of the Bacteroidetes/Chlorobi group and Firmicutes. Archaea were not altered by diet and represented ∼1% of all sequences. All archaea were members of Crenarchaeota and Euryarchaeota, with methanogens being the most abundant and diverse. Three fungal phylotypes were present in K9C, but none in K9BP. Less than 0.4% of sequences were of viral origin, with >99% of them associated with bacteriophages. Primary functional categories were not significantly affected by diet and were associated with carbohydrates; protein metabolism; DNA metabolism; cofactors, vitamins, prosthetic groups and pigments; amino acids and derivatives; cell wall and capsule; and virulence. Hierarchical clustering of several gastrointestinal metagenomes demonstrated phylogenetic and metabolic similarity between dogs, humans and mice. More research is required to provide deeper coverage of the canine microbiome, evaluate effects of age, genetics or environment on its composition and activity, and identify its role in gastrointestinal disease.
canine gut; gastrointestinal bacteria; metagenomics; pyrosequencing
Co-evolution of mammals and their gut microbiota has profoundly affected their radiation into myriad habitats. We used shotgun sequencing of microbial community DNA and targeted sequencing of bacterial 16S rRNA genes to understand how microbial communities adapt to extremes of diets, sampling fecal DNAs from 33 mammalian species and 18 humans who kept detailed diet records. We found that microbiota adaptation to diet is reproducible across different mammalian lineages. Functional repertoires of microbiome genes, such as those encoding carbohydrate-active enzymes and proteases, can be predicted from bacterial species assemblages. These results illustrate the value of characterizing vertebrate gut microbiomes to fully understand host evolutionary histories at a supra-organismal level.
Lateral gene transfer (LGT) between bacteria constitutes a strong force in prokaryote evolution, transforming the hierarchical tree of life into a network of relationships between species. In contrast, only a few cases of LGT from eukaryotes to prokaryotes have been reported so far. The distal animal intestine is predominantly a bacterial ecosystem, supplying the host with energy from dietary polysaccharides through carbohydrate-active enzymes absent from its genome. It has been suggested that LGT is particularly important for the evolution of the human microbiota. Here we show evidence for the first eukaryotic gene identified in multiple gut bacterial genomes. In the genome sequences of several gut bacteria, we found a typically eukaryotic glycoside hydrolase necessary for starch breakdown in plants. The distribution of this gene is patchy in gut bacteria, with presence otherwise detected only in a few environmental bacteria.
We speculate that the transfer of this gene to gut bacteria occurred by a sequence of two key LGT events: first, an original eukaryotic gene was transferred, probably from Archaeplastida, to environmental bacteria specialized in plant polysaccharide degradation, and second, the gene was transferred from the environmental bacteria to gut microbes.
eukaryote-to-prokaryote LGT; DPE2; Bacteroides sp; gut microbiota; GH77
Filamentous fungi are potent biomass degraders due to their ability to thrive in ligno(hemi)cellulose-rich environments. During the last decade, fungal genome sequencing initiatives have yielded abundant information on the genes that are putatively involved in lignocellulose degradation. At present, additional experimental studies are essential to provide insights into the fungal secreted enzymatic pools involved in lignocellulose degradation.
In this study, we performed a wide analysis of 20 filamentous fungi for which genomic data are available to investigate their biomass-hydrolysis potential. A comparison of fungal genomes and secretomes using enzyme activity profiling revealed discrepancies in the sets of carbohydrate-active enzymes (CAZymes) dedicated to the plant cell wall. Investigation of the contribution made by each secretome to the saccharification of wheat straw demonstrated that most of them individually supplemented the industrial Trichoderma reesei CL847 enzymatic cocktail. Unexpectedly, the most striking effect was obtained with the phytopathogen Ustilago maydis, which improved the release of total sugars by 57% and of glucose by 22%. Proteomic analyses of the best-performing secretomes indicated a specific enzymatic mechanism of U. maydis that is likely to involve oxido-reductases and hemicellulases.
This study provides insight into the lignocellulose-degradation mechanisms of filamentous fungi and allows for the identification of a number of enzymes that are potentially useful to further improve the industrial lignocellulose bioconversion process.
Filamentous fungi; genomes; lignocellulose; enzymatic hydrolysis; cellulases; oxido-reductases; glycosyl hydrolases; Ustilago maydis; mass spectrometry
Competition for nutrients contained in diverse types of plant cell wall-associated polysaccharides may explain the evolution of substrate-specific catabolic gene modules in common bacterial members of the human gut microbiota.
Symbiotic bacteria inhabiting the human gut have evolved under intense pressure to utilize complex carbohydrates, primarily plant cell wall glycans in our diets. These polysaccharides are not digested by human enzymes, but are processed to absorbable short chain fatty acids by gut bacteria. The Bacteroidetes, one of two dominant bacterial phyla in the adult gut, possess broad glycan-degrading abilities. These species use a series of membrane protein complexes, termed Sus-like systems, for catabolism of many complex carbohydrates. However, the role of these systems in degrading the chemically diverse repertoire of plant cell wall glycans remains unknown. Here we show that two closely related human gut Bacteroides, B. thetaiotaomicron and B. ovatus, are capable of utilizing nearly all of the major plant and host glycans, including rhamnogalacturonan II, a highly complex polymer thought to be recalcitrant to microbial degradation. Transcriptional profiling and gene inactivation experiments revealed the identity and specificity of the polysaccharide utilization loci (PULs) that encode individual Sus-like systems that target various plant polysaccharides. Comparative genomic analysis indicated that B. ovatus possesses several unique PULs that enable degradation of hemicellulosic polysaccharides, a phenotype absent from B. thetaiotaomicron. In contrast, the B. thetaiotaomicron genome has been shaped by increased numbers of PULs involved in metabolism of host mucin O-glycans, a phenotype that is undetectable in B. ovatus. Binding studies of the purified sensor domains of PUL-associated hybrid two-component systems in conjunction with transcriptional analyses demonstrate that complex oligosaccharides provide the regulatory cues that induce PUL activation and that each PUL is highly specific for a defined cell wall polymer. 
These results provide a view of how these species have diverged into different carbohydrate niches by evolving genes that target unique suites of available polysaccharides, a theme that likely applies to disparate bacteria from the gut and other habitats.
Bacteria inhabiting the human gut are critical for digestion of the plant-derived glycans that compose dietary fiber. Enzymes produced by the human body cannot degrade these abundant dietary components, and without bacterial assistance they would go unused. We investigated the molecular strategies employed by two species belonging to one of the most abundant bacterial groups in the human colon (the Bacteroidetes). Our results show that each species has evolved to degrade a unique subset of glycans; this specialization is reflected in their respective genomes, each of which contains numerous separate gene clusters involved in metabolizing plant fiber polysaccharides or glycans present in secreted mucus. Each glycan-specific gene cluster produces a related series of membrane-associated proteins which together serve to bind and degrade a specific glycan. Expression of each glycan-specific gene cluster is controlled by an environmental sensor that responds to the presence of a unique molecular signature contained in the substrate that it targets. These results provide a view of how related bacterial species have diverged into different carbohydrate niches by evolving genes that sense and degrade unique suites of available polysaccharides, a process that likely applies to disparate bacteria from the gut and other habitats.