Are the physical and chemical niche attributes of the gut (structural configuration, flow rate, temperature, pH, nature of substrates) the principal factors governing microbial community composition of the mammalian gut? Are all mammals, when born germ-free, best viewed as empty vessels with no host-mediated control over the inhabitants of the gut? To address these questions, we have placed the mammalian gut microbiotas into the greater context of 'free-living' microbial communities not associated with the body surfaces of multicellular eukaryotes and those associated with non-vertebrates, non-mammal vertebrates, and other human habitats, using published 16S rRNA surveys. This dataset combines the mammalian gut dataset described above (ref. 32) with the 'environmental dataset' of Lozupone and Knight (2007), composed of 202 samples from 111 published studies of diverse free-living communities, including soils, seawater, hot springs, sediments and lakes. The analysis of the free-living communities showed that microbial consortia of the biosphere fell clearly into two main groups: those adapted to saline conditions, and those adapted to non-saline conditions (ref. 5). Another finding was that habitats in different locales harbored similar communities. Thus, despite the potential of horizontal gene transfer to confer any function to any lineage, it appears that the same phylogenetic lineages are performing the same functions under similar conditions in different places. The patterns observed for the gut microbiotas are generally similar: mammals with similar diets harbor similar microbiotas.
To allow for comparisons between the mammal gut and the guts of other vertebrates and non-vertebrates, and human-associated habitats, we augmented this combined dataset substantially with data from other published studies (Table S1). To the mammalian gut dataset, which consisted entirely of fecal samples from healthy individuals, we added gut samples retrieved from mucosal tissue and rumen fluids, and from adult and infant humans with a range of physiologic and pathophysiologic phenotypes (obesity, antibiotic-resistant diarrhea, colonic diverticulosis). We augmented the 'environmental dataset' with 34 samples from large sequencing efforts of free-living communities that were published after our initial analysis. In addition, we added samples from other human body habitats (including the mouth, skin, ear, vulva, and vagina), from the guts of non-mammal vertebrates, such as poultry and zebrafish, and from the gut or whole body of diverse metazoa, including termites, beetles, lice, earthworms, fruit flies, mosquitoes, bees, gypsy moth larvae, corals and sponges. The final dataset consisted of 99,801 16S rRNA sequences from 464 samples and 181 studies (see Table S1).
We anticipated that if physical and chemical attributes of the gut habitat are principal shapers of microbial community composition, those free-living communities most similar to the mammalian gut communities might be, for instance, those associated with anoxic environments with high levels of complex polysaccharides (for example, anoxic soils or bogs). We also expected that gut communities would be far less different from one another than communities from the other environments, given that the temperature, pH and other physical-chemical parameters are much more constrained.
The patterns that emerge from the combined dataset are very different from what we anticipated. Bacterial communities that occupy the majority of vertebrate guts are markedly different from non-animal (free-living) bacterial communities. Principal coordinate analysis (PCoA) based on UniFrac distances clearly separates bacterial communities obtained from the vertebrate gut from other types of communities. Remarkably, the separation along PC1 (the principal coordinate that accounts for the largest amount of variance between samples) of vertebrate gut-associated communities from free-living communities is more than twice as great as the separation between saline and non-saline communities, which is evident along PC3. Samples with intermediate values along PC1 are from the human mouth, vaginal epithelium, and vulva, plus the guts of carnivorous vertebrates and herbivorous/omnivorous bears, indicating that microbes that perform fermentation-based degradation of plant material in the intestinal tract contribute greatly to the extreme divergence of vertebrate gut communities. Moreover, one sample from an anoxic rice paddy soil (ref. 76) clusters in an intermediate position between the free-living and vertebrate gut samples along PC1, supporting the notion that the anaerobic, polysaccharide-rich nature of the gut environment could be related to the dichotomy.
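The ordination step can be illustrated with a minimal Python sketch of classical PCoA. It assumes a symmetric matrix of pairwise community distances has already been computed (the UniFrac calculation itself is not shown), and the four sample labels and the toy distances are hypothetical values chosen for illustration, not data from this analysis. The function name `pcoa` and its parameters are likewise illustrative.

```python
import numpy as np


def pcoa(dist, n_axes=3):
    """Classical PCoA of a symmetric distance matrix.

    Returns sample coordinates and, for each returned axis, the fraction
    of the total positive-eigenvalue variance that it captures.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Gower double-centering of the squared distances
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    # b is symmetric, so eigh applies; sort eigenpairs largest first
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Clip tiny negative eigenvalues caused by rounding
    kept = np.clip(eigvals[:n_axes], 0.0, None)
    coords = eigvecs[:, :n_axes] * np.sqrt(kept)
    explained = kept / eigvals[eigvals > 0].sum()
    return coords, explained


# Hypothetical toy distances: two gut samples and two free-living samples,
# similar within each group (0.2) and dissimilar between groups (0.9).
labels = ["gut_1", "gut_2", "free_1", "free_2"]
toy = [
    [0.0, 0.2, 0.9, 0.9],
    [0.2, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]

coords, explained = pcoa(toy)
for label, pc1 in zip(labels, coords[:, 0]):
    print(f"{label}: PC1 = {pc1:+.3f}")
print(f"PC1 fraction of variance: {explained[0]:.3f}")
```

In this toy case the two groups fall on opposite sides of PC1, which carries most of the variance. The sign of each axis is arbitrary, so only the relative positions of samples are meaningful.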
Variance in bacterial community composition between samples from vertebrate gut-associated and other 'free-living' communities
Another striking pattern to emerge from this analysis is the distinction between vertebrate- and invertebrate-associated communities. Almost all of the invertebrate gut communities cluster with the free-living communities, with the exception of termites and most of the samples from beetle larvae. The beetle samples that cluster between the noncarnivorous vertebrates and the free-living communities were all from the specialized anaerobic hindgut region of beetles with differentiated guts (refs 42, 43), whereas a beetle sample from the whole gut of larvae of Anoplophora glabripennis (ref. 44) clustered with the free-living communities. Similarly, the specialized gut structures of certain beetle taxa have been associated with methane production, and the presence of methanogens in terrestrial arthropods in general was found to be associated with taxon-specific traits (ref. 45). These findings suggest a strong host phylogenetic effect on the structure of the microbiota of arthropods, as in mammals. This clustering is also consistent with the theme that one key factor that shapes gut differentiation is the provision of an anaerobic environment with abundant complex carbohydrates from plant materials.
Other nonvertebrate-associated communities, such as those from adult and larval bees, gypsy moth larvae, whole fruit flies, and the earthworm gut, clustered with the free-living communities. Exceptions included the casts of earthworms, which clustered more closely with soil. The top left section of the PC plot shows an aggregation of bacterial communities associated with diverse complex organisms, such as bacteria that tightly associate with plant roots, the human skin, outer ear, and vulva, plus a subset of the coral and sponge samples. The free-living communities that also cluster in the top left section of the plot were almost exclusively from studies that used a culturing step prior to PCR amplification of 16S rRNA genes. This association suggests that these animal-associated communities are composed of r-selected organisms (i.e., fast-growing or 'weedy') that can quickly utilize readily available nutrients. Indeed, a high copy number of 16S rRNA genes, which correlates with fast growth rates (ref. 45), is typical of microbes that are common in the vertebrate gut (ref. 46). These observations suggest that r-selection may have been important for the earliest associations of bacteria with animal guts.
Finally, the split between saline and non-saline environmental communities extends to non-vertebrates that inhabit saline and non-saline habitats. The third principal coordinate (PC3) in this analysis clearly separated the saline from the non-saline free-living communities, a split that had been previously described (ref. 5). Mirroring this pattern, in the expanded dataset, the marine sponges and corals harbor bacterial communities most similar to free-living communities associated with saline environments, and terrestrial non-vertebrate hosts (e.g., earthworms, bees, the gypsy moth, fruit flies, chewing lice, and beetles) harbor communities more similar to the free-living communities from non-saline habitats along this PC axis.
A deep dichotomy replicated in multiple bacterial lineages
The distribution of phyla takes on very different patterns in the gut than in other types of habitats. Across vertebrate gut samples, including the human gut, the Firmicutes and Bacteroidetes are the numerically dominant phyla. Although other phyla can make up a large proportion of the sequences recovered in some hosts (e.g., Actinobacteria in sheep), overall the Firmicutes and Bacteroidetes are the most ubiquitous and common. Moreover, although the Firmicutes and Bacteroidetes comprise a substantive fraction of a majority of communities in other types of habitats, other phyla tend to be more highly represented in non-gut samples. One important question is thus whether the UniFrac clustering patterns are due to differential representation of different phyla in different communities, or whether the patterns that relate groups of environments are replicated within each bacterial phylum.
Relative abundance of phyla in samples
The dichotomy between vertebrate gut and free-living communities observed at the whole-community level was indeed evident within the constituent phyla. We performed phylum-specific, UniFrac-based PCoA for the three phyla most highly represented across all 462 gut- and non-gut-associated microbial community samples: the Bacteroidetes, Firmicutes and Proteobacteria. Within the Bacteroidetes division alone, PC1 again separates the vertebrate and termite-associated gut microbiotas from free-living bacterial communities, and PC3 separates the saline and non-saline free-living communities. Remarkably, the variation between samples is greater for the vertebrate gut-associated Bacteroidetes than for the free-living Bacteroidetes. The dichotomy between vertebrate gut and free-living communities is also evident when the analysis is applied to the Proteobacteria and Firmicutes individually.
The phylum-specific analysis also helps to explain why samples from the guts of carnivores tend to cluster closer to free-living communities in the full analysis. Although this similarity may be attributed in part to the deficit of Bacteroidetes (most of the carnivores drop out of the Bacteroidetes-specific analysis), even the Firmicutes within the carnivores are more similar to those in free-living communities than to those that reside in the guts of herbivores and omnivores.
Genera that cross the divide
Another way to visualize the vertebrate gut/environmental dichotomy is with a network diagram that displays, in addition to the clustering of hosts with similar microbiotas, the bacterial genera that they share. In this representation of the data, the vertebrate gut samples are far more connected to one another than to the environmental samples. As in the UniFrac-based analysis, the non-gut human samples also occupy an intermediate position between the free-living and the gut communities. Coloring the network by phylum shows the phylogenetic classification of the operational taxonomic units (OTUs) that are shared between samples: among humans, these are overwhelmingly Firmicutes, with some Bacteroidetes. In contrast, the free-living communities share OTUs from a wider range of phyla. Samples obtained from the guts of obese humans cluster away from the samples from healthy subjects and are linked principally by Firmicutes; this observation is consistent with the finding that samples from obese individuals have a higher representation of the Firmicutes than do those from lean subjects (ref. 31).
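The idea behind such a network can be sketched in a few lines of Python: samples and genus-level OTUs form the two node types, an edge joins a sample to each genus detected in it, and two samples are connected through the genera they share. The sample names, habitat labels and placeholder genera (`genus_A` and so on) below are hypothetical and stand in for real survey data; this is an illustration of the data structure, not the procedure used to build the published network.

```python
from itertools import combinations

# Hypothetical toy data: sample -> set of genus-level OTUs detected in it
samples = {
    "gut_1": {"genus_A", "genus_B", "genus_C"},
    "gut_2": {"genus_A", "genus_B", "genus_D"},
    "soil_1": {"genus_C", "genus_E", "genus_F"},
    "sea_1": {"genus_E", "genus_G"},
}
# Hypothetical habitat label for each sample
habitat = {"gut_1": "gut", "gut_2": "gut", "soil_1": "free", "sea_1": "free"}


def bipartite_edges(samples):
    """List the (sample, genus) edges of the sample-genus network."""
    return sorted((s, g) for s, genera in samples.items() for g in genera)


def shared_genera(samples):
    """Map each pair of samples to the set of genera that links them."""
    return {
        (a, b): samples[a] & samples[b]
        for a, b in combinations(sorted(samples), 2)
    }


def cosmopolitan_genera(samples, habitat):
    """Genera detected in at least one gut and one free-living sample."""
    by_habitat = {"gut": set(), "free": set()}
    for s, genera in samples.items():
        by_habitat[habitat[s]] |= genera
    return by_habitat["gut"] & by_habitat["free"]


edges = bipartite_edges(samples)
shared = shared_genera(samples)
cosmopolitan = cosmopolitan_genera(samples, habitat)

for pair, genera in shared.items():
    print(pair, sorted(genera))
print("cosmopolitan:", sorted(cosmopolitan))
```

In this toy example the two gut samples are linked by two genera, whereas only one genus links a gut sample to a free-living sample. That single genus is the only one reported as cosmopolitan.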
Network analysis of bacterial communities from animal-associated and free-living communities
Network analysis of bacterial communities from animal-associated and free-living communities, where host nodes are all colored white and genus-level OTU nodes are colored according to their phylogenetic classification at the phylum level.
Bacterial genera that connect the vertebrate gut-associated microbiotas to the free-living communities by inhabiting both can be viewed as cosmopolitan. As these analyses mainly capture the dominant members of a microbiota, these genera are presumed to grow and subsist in that environment (autochthonous members), and not simply to be passing through (allochthonous members). Among these cosmopolitan groups was the Pseudomonadaceae family of the Gammaproteobacteria lineage, which contained OTUs detected both in the vertebrate gut and free-living in saline and non-saline habitats. The Enterobacteriales (Gammaproteobacteria) were detected in the vertebrate gut, termite gut and in other invertebrates, but also in a surface soil sample and anoxic saline water. The Staphylococcaceae (Bacilli, Firmicutes) were common in the vertebrate gut samples but were also detected in soil and cultures derived from freshwater and saline habitats. Finally, members of Fusobacterium were detected in saltwater sediments in addition to the vertebrate gut. The cosmopolitan distribution of these organisms may have made them particularly important for introduction of novel functions during evolution of the gut microbiota, as they can bring new useful genes from the global microbiome into the gut microbiome via horizontal gene transfer. [A caveat: some of the OTUs that are very common in humans and that occur at very low abundance in free-living communities may be contaminants of environmental samples introduced during handling (ref. 47).]
In summary, the gut/non-gut dichotomy in community composition is evident across the bacterial tree and within phyla, and it manifests as distinct sets of genera. This raises a question: what types of selective pressures act on these many diverse lineages of gut microbes, driving them to 'differentiate' into gut and non-gut groups?