Over the course of our lives, humans are colonized by a tremendous diversity of commensal microbes, which comprise the human microbiome. The collective genetic potential (metagenome) of the human microbiome is orders of magnitude larger than that of the human genome, and it profoundly affects human health and disease in ways we are only beginning to understand. Advances in computing and high-throughput sequencing have enabled population-level surveys such as MetaHIT and the recently released Human Microbiome Project, detailed investigations of the microbiome in human disease, and mechanistic studies employing gnotobiotic model organisms. The resulting knowledge of human microbiome composition, function, and range of variation across multiple body sites has begun to assemble a rich picture of commensal host-microbe and microbe-microbe interactions as well as their roles in human health and disease and their potential as diagnostic and therapeutic tools.
Humans consist only of our own somatic cells until birth, but over the first several years of life, our bodies, including the skin surface, oral cavity, and gut, are colonized by an enormous variety of bacteria, archaea, fungi, and viruses, which form a community collectively known as the human microbiome or microbiota [1–3]. Our microbiome contains ten times as many cells as the rest of our bodies, and orders of magnitude more genes than the human genome [1, 4]. Normally these microbes are commensals or mutualists, helping to digest our food and maintain our immune systems. Although the human microbiome has long been understood to influence human health and disease, low-cost high-throughput sequencing has only recently given us the experimental tools to investigate the breadth of its involvement. This has resulted in an explosion of recent work illuminating the role of the microbiome in conditions ranging from obesity [5, 6] to inflammatory bowel disease (IBD) [7, 8] to kidney stones. Large-scale surveys such as the Metagenomics of the Human Intestinal Tract (MetaHIT) consortium and the recent Human Microbiome Project (HMP) [1, 11] have shown that these are only the beginning of a richer functional understanding of the human microbiome.
Although human-associated microbes were among the first investigated by microscopy, most are difficult or impossible to culture under convenient laboratory conditions. The vast majority of microbial taxa were uncultured and thus minimally studied until the development of DNA-based techniques. The earliest high-throughput technique for microbial ecology (and still the most common) is to sequence the highly-conserved 16S rRNA gene. Also known as 16S gene (or amplicon) profiling, this provides a rapid and very cost-effective survey of the bacteria present within a community, but it is rarely informative with regard to non-bacterial microbes and offers little functional information. Historically, this gene was amplified with universal bacterial primers, and the individual sequences were cloned and determined using Sanger sequencing. More recent 16S sequencing techniques are very high-throughput, generating up to millions of reads per sample, but with typically very short read lengths.
Metagenomic studies, in which the DNA of an entire community is sequenced, provide a much broader and more complex picture of microbial communities by simultaneously cataloging community organisms (including the non-bacterial component), genes, and genomes (Figure 1). The earliest large-scale metagenomic studies began soon after the turn of the century [14, 15], but the expense of clone library construction and Sanger sequencing made metagenomic human microbiome studies unfeasible for most groups. The advent of high-throughput sequencing made much larger-scale metagenomic efforts, including MetaHIT and the HMP [1, 11], economically feasible. Here we review what these and recent disease- and model-organism studies have shown us about the composition and genetic capabilities of the human microbiome and discuss the future of the field of human microbiome research.
The largest early efforts at measuring the composition of the human microbiome were 16S cloning-based studies of the gut that included fecal and mucosal samples from up to a dozen individuals [16, 17]. These resulted in 10,000–20,000 sequences each, corresponding to between 400 and 4,000 taxa (this range of species counts reflects largely bioinformatic rather than biological differences between studies). These investigations showed that gut alpha diversity (the diversity within a single sample) was much higher than most previous estimates and that beta diversity (the difference in composition between samples) was high both between subjects and between mucosal and stool samples from the same subject. This, combined with preliminary associations between altered microbiota and disease, highlighted the need for much larger-scale studies to determine the range of microbial variation and function in the human gut across entire populations.
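As a rough illustration of the two diversity measures discussed above, the following minimal sketch computes one common alpha-diversity statistic (the Shannon index) and one common beta-diversity statistic (Bray-Curtis dissimilarity). The choice of these two statistics, the function names, and the taxon counts are assumptions made for illustration; the studies cited here may have used other measures.

```python
import math

def shannon_diversity(counts):
    """Alpha diversity: Shannon index (natural log) of one sample's taxon counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

def bray_curtis(counts_a, counts_b):
    """Beta diversity: Bray-Curtis dissimilarity between two samples.

    0 means identical counts; 1 means no taxa in common.
    Both lists must index the same taxa in the same order.
    """
    shared = sum(min(a, b) for a, b in zip(counts_a, counts_b))
    return 1 - 2 * shared / (sum(counts_a) + sum(counts_b))

# Hypothetical read counts for the same four taxa in three samples
stool_subject1 = [50, 30, 15, 5]
mucosa_subject1 = [10, 10, 40, 40]
stool_subject2 = [0, 0, 60, 40]

for name, sample in [("stool 1", stool_subject1),
                     ("mucosa 1", mucosa_subject1),
                     ("stool 2", stool_subject2)]:
    print(name, "Shannon index:", round(shannon_diversity(sample), 3))

print("stool 1 vs mucosa 1:", bray_curtis(stool_subject1, mucosa_subject1))
print("stool 1 vs stool 2:", bray_curtis(stool_subject1, stool_subject2))
```

In practice these statistics are computed on OTU tables with thousands of columns, and the results depend on sequencing depth, so samples are usually subsampled to equal depth first.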
The first population-scale metagenomic study of the human microbiome was conducted by the MetaHIT consortium. Stool samples sequenced from 124 Spanish and Danish subjects were estimated to contain 1,150 common bacterial species with a collective 3.3 million genes. However, only 75 of these organisms and 294,000 genes were shared by more than half of subjects, even within a population with limited genetic and environmental diversity. In addition to essential housekeeping genes, common gut bacteria were enriched for pathways that metabolized complex sugars from the host diet and intestinal lining, as well as for adhesion and vitamin and xenobiotic processing. The abundance of archaeal and fungal components was very low (<1%), although as in all such studies, this is influenced by sample collection and DNA extraction.
The HMP was by far the largest human microbiome study to date, a five-year effort that determined the range of microbial diversity in disease-free individuals by sequencing over 5,000 samples from approximately 250 healthy volunteers in two U.S. cities. Each subject was sampled at 15 or 18 body sites (nine oral, four skin, one nasal, one stool, three vaginal), and roughly half of subjects were sampled at up to two additional time points. Most samples were profiled using 16S amplicon sequencing to determine community composition; approximately 15% were subjected to metagenomic sequencing to determine community function [1, 11]. In addition to expanding the MetaHIT gut gene catalogue by nearly 1.8 million genes and identifying a microbiome-wide count of over 10 million total unique microbial genes, the HMP allowed detection of between-body-site microbial associations.
Estimating a community's species richness is bioinformatically challenging. As a result, estimates of the total human microbiome assessed by the HMP ranged between 3,500 and 35,000 species-level Operational Taxonomic Units (OTUs) depending on parameter choices. These, however, confidently spanned roughly 600 genera and covered 90% of the phylogenetic range of microbes expected in this population [1, 19]. Several signature bacterial genera were observed across nearly all individuals and represented the plurality (and often majority) of particular body site communities (Figure 2), but specific taxa were highly variable and almost never universal. The gastrointestinal tract, including both the oral cavity and stool, was the most diverse microbiome among the sampled body habitats, estimated to contain several thousand OTUs corresponding to approximately 150 genera. The well-studied gut microbiome exemplified inter-individual variability; the Bacteroidetes and Firmicutes phyla dominated as expected, but abundance ranged from >90% Bacteroidetes to >90% Firmicutes in different subjects. Only some half dozen gut taxa were near-universal (present in >=95% of subjects).
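The notion of a "near-universal" taxon used above can be made concrete with a small prevalence calculation. This is a minimal sketch under stated assumptions: the subject identifiers, genera, and relative abundances are invented, and "present" is taken to mean any abundance above zero, whereas real analyses typically apply a detection threshold.

```python
def prevalence(abundance_by_subject, taxon):
    """Fraction of subjects in which the taxon is detected (abundance > 0)."""
    profiles = abundance_by_subject.values()
    present = sum(1 for profile in profiles if profile.get(taxon, 0) > 0)
    return present / len(abundance_by_subject)

def near_universal_taxa(abundance_by_subject, threshold=0.95):
    """Taxa whose prevalence across subjects is at least `threshold`."""
    taxa = set().union(*(p.keys() for p in abundance_by_subject.values()))
    return sorted(t for t in taxa
                  if prevalence(abundance_by_subject, t) >= threshold)

# Hypothetical relative abundances per subject (not HMP data)
subjects = {
    "s1": {"Bacteroides": 0.60, "Faecalibacterium": 0.10, "Prevotella": 0.0},
    "s2": {"Bacteroides": 0.05, "Faecalibacterium": 0.20, "Prevotella": 0.50},
    "s3": {"Bacteroides": 0.30, "Faecalibacterium": 0.0},
    "s4": {"Bacteroides": 0.45, "Faecalibacterium": 0.15},
}

print(near_universal_taxa(subjects))        # default 95% cutoff
print(near_universal_taxa(subjects, 0.75))  # a looser cutoff
```

Note that a taxon can be near-universal in prevalence while still varying enormously in abundance between subjects, as the Bacteroides values in this toy example do.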
All body areas examined by the HMP dramatically differed not only in composition, but in ecological organization as well (see Text Box 1). The oral microbiome was essentially as diverse as the gut but, by contrast, more of this diversity was shared between individuals. Signature clades also varied strikingly even between very similar oral surfaces (e.g., mucosa, saliva, and plaque); abundant genera included Streptococcus, Haemophilus, Veillonella, Actinomyces, and Fusobacterium. Skin diversity (sampled at the inner elbows and behind the ear) was low both within and between subjects, perhaps due to exposure and frequent disruption. The human body community of lowest alpha diversity was that of the vagina; in most women it was dominated by one of four species of Lactobacillus. The vaginal microbiome becomes even less diverse during pregnancy, and it is emerging as a useful human-associated model community due to its uniqueness relative to other primates and highly structured temporal fluctuations [24, 25].
The HMP discovered and validated several novel taxa, mostly at the genus level, and also hinted at a possible novel family within the order Clostridiales. Novel OTUs were generally of low abundance (<2%), a level comparable to that of E. coli in healthy subjects, but were carried across multiple subjects and multiple visits. The novel taxa were mainly from the largely-uncharacterized Barnesiella genus, but also included novel Dorea, Oscillibacter, and Desulfovibrio genera, which are associated broadly with colon cancer, dietary shifts, and opportunistic infections, respectively. Low-abundance community members can affect health even at low numbers and, in the worst case, can overgrow to the detriment of host health (e.g., Clostridium difficile). Further study may reveal important roles for these novel taxa. However, as the microbial variation between individuals was greater than that of samples from the same subject at different points in time, these low-abundance microbes at the very least contribute to a uniquely personalized human microbiome when considered in terms of microbial presence and abundance.
Human microbiome composition appears to be particularly dependent on age and geography, although both are confounded by a variety of dietary, developmental, environmental, and genetic factors. The HMP included only American adults between the ages of 18 and 40, and MetaHIT extended this somewhat with Spanish and Danish individuals up to age 70. The infant microbiome is of particular interest with respect to acquisition, immune training, and extreme dietary changes over the first few years of life [28, 29]. Some of the earliest studies of microbiome structure and function indicated that immediate acquisition occurs from the environment and is dependent on mode of delivery, with stabilization to a more adult-like community occurring over two to three years. Functionally, the shift from the newborn to adult-like gut microbiome includes enrichment for anaerobic fermentation and complex carbohydrate degradation (e.g., glycosyl hydrolases), and a depletion of simple sugar (lactose/galactose/sucrose) transport and metabolism proteins. Recent studies of the microbiota of the elderly found that frailty is correlated with an overall decrease in microbial diversity, with overall increases in Bacteroides and decreases in Firmicutes that corresponded to a decrease in glutarate and the anti-inflammatory short-chain fatty acid butyrate.
Another recent comprehensive study addressed several of these questions simultaneously by surveying the stool microbiomes of a highly-diverse population that included 314 Americans, 114 Malawians, and 100 Amerindians. Roughly two-thirds of the subjects were under the age of 17, while the remaining third were between the ages of 18 and 70. In all three populations of this study, inter-subject variability was higher in children than in adults, infant microbiomes were dominated by Bifidobacteria, and microbiomes reached an adult-like configuration by age three. Microbial diversity increased with age in all populations, but both Malawian and Amerindian adults achieved higher diversity than American adults. The Malawian and Amerindian microbiomes were also more similar to one another than to the U.S. microbiome, in both adults and children.
When the authors examined functional gene families and enzyme classes within these global gut metagenomes, no enzyme classes were wholly unique to adults or babies, and the total number of enzyme classes was similar in both groups. However, the adults had many more uncharacterized enzymes than babies, likely reflecting the age-related increase in diversity. The infant microbiome had higher levels of folate-producing enzymes than that of adults, likely reflecting adults’ ability to obtain folate from diet. Conversely, biosynthetic capacity for the vitamins cobalamin, thiamine, and biotin increased with age. Urease genes were more frequently observed in Malawian and Amerindian infant microbiomes, possibly corresponding to greater probability of dietary nitrogen deficiency. Malawian and Amerindian adult microbiomes were enriched in glutamate synthase relative to the American microbiomes, the latter instead carrying an increased capacity for degradation of glutamine and other amino acids. Interestingly, this functional difference has also been observed when comparing the gut microbiota of herbivores and carnivores [32
Several human microbiome studies have reported the metagenomic distribution of pathways within each body habitat to be much more consistent among individuals than are microbial abundances [1, 6, 10, 33]. In each case, microbial abundances within the same habitat (gut, oral cavity, etc.) varied tremendously among subjects, but two types of "core" pathways were stable in abundance. The first type, corresponding to a more-or-less universal set of housekeeping genes like those reported by MetaHIT, was present in every habitat and represents processes necessary for human-associated microbial life, such as the ribosomal machinery, ATP synthesis, and glycolysis.
More surprisingly, certain sets of site-specific gene functions were maintained within each habitat regardless of the taxa present. The stool microbiome was particularly abundant in genes related to complex carbohydrate degradation despite highly variable Bacteroidetes:Firmicutes ratios; hydrogen sulfide production and methionine degradation were also site-specific to the gut at low abundance. Individual profiles of carbohydrate-active enzymes within each body habitat were also very similar by site. The oral cavity microbiome, for example, was optimized for simple sugar metabolism and particularly for dextran, whereas the vaginal microbiome was optimized for glycogen and peptidoglycan degradation. Much like individual bacterial genomes, each habitat thus seems to have a core metagenome present in most hosts, in addition to a pan-metagenome of more flexible auxiliary genes carried by each habitat's community.
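The core versus pan-metagenome distinction drawn above amounts to a prevalence split over gene families. The sketch below is illustrative only: the host labels and gene-family names are invented, and the `min_fraction` cutoff that defines "core" is an assumed parameter rather than a value taken from the studies cited.

```python
def core_and_auxiliary(gene_families_by_host, min_fraction=1.0):
    """Split a habitat's gene families into core and auxiliary sets.

    A family is 'core' if it occurs in at least `min_fraction` of hosts;
    every other family in the pan-metagenome is 'auxiliary'.
    """
    hosts = list(gene_families_by_host.values())
    pan = set().union(*hosts)
    core = {family for family in pan
            if sum(family in host for host in hosts) / len(hosts) >= min_fraction}
    return core, pan - core

# Hypothetical gene-family content of three hosts' gut metagenomes
gut = {
    "host_a": {"ribosomal_proteins", "ATP_synthase", "glycoside_hydrolase", "urease"},
    "host_b": {"ribosomal_proteins", "ATP_synthase", "glycoside_hydrolase", "sulfate_reduction"},
    "host_c": {"ribosomal_proteins", "ATP_synthase", "glycoside_hydrolase"},
}

core, auxiliary = core_and_auxiliary(gut)
print("core:", sorted(core))
print("auxiliary:", sorted(auxiliary))
```

Lowering `min_fraction` (for example to 0.5) corresponds to the looser "present in most hosts" definition and moves more families into the core set.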
Much of our knowledge of microbiome influence in human health comes from gnotobiotic systems, which predate the field of metagenomics. Gnotobiotics are animals (typically mice or zebrafish) in which normal host microbiota has been replaced by a defined set of microbes, allowing studies of how microbial colonization influences host health and development or other processes of interest. Gnotobiotic animals are born under germ-free conditions and subsequently colonized (typically by gavage) with a single species, a defined set of species such as the eight-species Schaedler flora, or entire gut communities from mice or humans with phenotypes of interest [35, 37]. Gnotobiotic studies have shown that gut microbiota affects the host in many ways, including fertility, the development of the heart, lungs, and liver, and even brain gene expression and behavior. For example, a recent study found that germ-free mice were less anxious and more active than mice with normal gut microbiota, but these effects were largely reversed by exposure to gut microbiota at an early age. Three areas of particularly intense focus are the heavy influence of the microbiome on nutrition and obesity, immune system development [36, 39–41], and inflammatory bowel disease [7, 8, 42].
The earliest attempts at raising germ-free animals demonstrated the vital contribution of the gut microbiota to nutrition, as the animals rapidly succumbed to vitamin deficiencies and required extensive supplementation. When germ-free mice were colonized with gut bacteria from conventionally-raised adult mice, their food intake decreased, but their body fat increased due to increased monosaccharide availability from food signaling the liver to increase fat storage. Studies of mice genetically predisposed to obesity found that they had more Firmicutes and fewer Bacteroidetes than their genetically-lean siblings, and when a group of germ-free mice were colonized with microbiota from obese mice, they gained more weight than those colonized with microbiota from lean mice. Furthermore, mice colonized with human gut microbiota showed large and rapid shifts in gut microbiome composition in response to diet; these were accompanied by stable changes in microbial taxa abundance and gene expression.
The results of human obesity studies have proven more variable than the mouse studies; Bacteroides have been reported by various groups to increase, decrease, and not change during weight loss. Other controlled feeding studies suggest that short-term dietary changes can modify taxon abundances or transcriptional levels but seldom change the presence or absence of specific taxa. Thus, it has been difficult to reconcile the extreme changes often seen in gnotobiotic animals with the more subtle behavior of the human gut.
A study comparing the gut microbiota of normal weight, obese, and post-gastric-bypass subjects noted that obese individuals were more likely to harbor both Prevotella and methanogenic archaea. The authors proposed that methanogens removed hydrogen produced by Prevotella, allowing Prevotella to more efficiently produce short-chain fatty acids, which were absorbed by the host, contributing to host obesity. This proposal is supported by evidence from gnotobiotic mouse studies; mice co-colonized with Bacteroides thetaiotaomicron (which, like Prevotella, ferments carbohydrates into short-chain fatty acids and produces hydrogen) and the methanogen Methanobrevibacter smithii had higher numbers of total gut bacteria, higher acetate levels in the intestinal lumen and blood, and more body fat than monocolonized mice or mice co-colonized with B. thetaiotaomicron and the sulfate-reducing bacterium Desulfovibrio piger.
In addition to nutrient absorption, the gut microbiome also greatly affects immune development. The immune systems of germ-free mice are strikingly abnormal, with smaller lymph nodes, lower serum immunoglobulin levels, and lower levels of leukocytes than conventionally-raised mice. Germ-free mice are particularly deficient in CD4+ T cells [48, 49], but colonization with Bacteroides fragilis or even exposure to its polysaccharide restores CD4+ cells to normal levels. Monocolonization of germ-free mice with segmented filamentous bacteria (SFB), a mouse-commensal Clostridia, simultaneously stimulated many kinds of T cells, including Th1, Th2, Th17, and Treg. This immune activation is not universally beneficial; SFB also trigger autoimmune arthritis in genetically-predisposed mice in a Th17-dependent manner, whereas these mice do not become arthritic if raised in germ-free conditions.
Gut/immune system interactions are not limited to CD4+ cells; natural killer (NK) cells accumulate in large numbers in the lungs and colons of germ-free mice, making them much more susceptible to experimental asthma and colitis. This NK accumulation cannot be reversed by exposing mice to germs later in development but is prevented by very early exposure to normal mouse gut flora. The continuum between early immune development, ongoing training, and acute or localized inflammation is still being explored. Some specific microbes, primarily Faecalibacterium prausnitzii, have been suggested to act as suppressors of inflammation by short-chain fatty acid production and stimulation of the anti-inflammatory cytokine IL-10.
In addition to its roles in obesity and immune response, it is increasingly evident that the microbiome influences the development of many other diseases (Table 1). IBD is only one example of a human disease that apparently does not follow Koch's postulates, yet is intimately connected to the host microbiome. It has long been known that IBD is both etiologically and microbially distinct, and that antibiotics, particularly rifampicin, are sometimes beneficial in its treatment. Rodent studies showed that rats and mice genetically predisposed to IBD would not develop the disease if they were kept germ-free or preemptively treated with antibiotics. Furthermore, fostering or co-caging healthy mice with IBD-predisposed mice was sufficient to cause IBD in the healthy mice; this correlated with the transfer of the Enterobacteriaceae species Klebsiella pneumoniae and Proteus mirabilis from the IBD mice to the healthy mice. Intriguingly, this colitis was attenuated by the administration of probiotic fermented milk containing Bifidobacterium spp.
The relationships between human gut microbiota and IBD are less well understood, but it is widely accepted that human IBD is not caused by a specific pathogen, but rather by an immune system imbalance with respect to normal gut bacteria. Metagenomic studies of IBD in the human gut have consistently shown that IBD is associated with gut dysbiosis and reduced microbial diversity. A lack of Faecalibacterium prausnitzii was associated with high postoperative recurrence of Crohn's disease, and F. prausnitzii supplementation attenuated experimental colitis in mice, but no human IBD F. prausnitzii supplementation studies have been reported to date. Many groups have observed decreases in members of anti-inflammatory Clostridia from clades IV and XIVa and increases in Escherichia coli, but it has not yet been determined whether these microbiome changes are a cause of IBD in humans or are merely caused by IBD. Two probiotics, E. coli strain Nissle (isolated from an exceptionally gastroenteritis-resistant World War I soldier) and VSL#3 (eight strains of Bifidobacterium, Lactobacillus, and Streptococcus), have been shown to be effective treatments for maintaining and inducing remission in ulcerative colitis, but the underlying molecular mechanisms are not yet well understood. Further probiotic research may shed light on how short- or long-term "ecosystem therapy" may restore balance to dysbiosis.
In addition to understanding the basic ecological principles of human microbiome organization and its life-long epidemiology, two major areas of microbiota function have recently begun to be explored. The first is diagnostic: does microbiota composition or genetic function predict human disease onset, progression, or outcome? The past decade has seen great strides in assessing disease history and predicting risk using high-dimensional gene expression and genetic biomarkers; the next decade is likely to see the same with respect to the microbiome [55, 56]. The microbiome's second area of tremendous potential is in therapeutic intervention; human genetic defects are now detectable but difficult to modify, whereas the microbiome is both measurable and plastic. Targeted narrow-spectrum antibiotics, drug-microbe interactions, probiotics [45, 52, 54], and dietary interventions all show promise for beneficial microbiome modulation. Remarkably, the roles and mechanisms by which host genetics, early life events, geographical location, or transmission of the microbiota might operate all represent largely open questions.
The technologies available to realize these research goals continue to develop at a remarkable pace, and are rapidly approaching diagnostic utility. Sanger sequencing has been replaced by Roche 454, and this is in turn giving way to ultra-high-throughput Illumina sequencing [13, 60], with longer reads for microbial isolate genomes and community metagenomes on the horizon [61, 62]. Metagenomes measure only what a community is capable of doing, but developing functional assays will better assess how it dynamically responds to its environment. Metatranscriptomics presents additional technical challenges such as depletion of ribosomal RNA, but has already been used to describe marine and fecal communities. Metaproteomics, or quantification of community peptides, has likewise been steadily developing to describe, for example, the gut microbiota [65, 66]. Metametabolomics, or quantification of community small-molecule metabolites, has been used to compare differences in host blood plasma between germ-free and conventionally-raised mice and to compare fecal samples from healthy individuals and those with Crohn's disease. The immense computational challenge of assembling these vast and diverse meta'omic data to describe entire ecosystems, identifying how the genes interact with one another within and between bacteria to produce metabolites, and quantifying and understanding how these processes change over time and in response to the environment is the goal of the emerging field of molecular ecosystems biology. Integrating these new technologies with model organisms and culture-based approaches will help us to unravel our relationship with our microbial majority.
It is an exciting time for work on human-associated microbial communities, both for basic biological discovery and for translational research. Experimental and computational tools are at the cusp of widespread affordability and availability, driven both by large-scale projects such as the HMP and by the democratization of high-throughput genomics and bioinformatics. Large-scale metagenomic studies have shown that the human microbiome is highly variable between individuals, and that its genetic capacity is far greater than that of the human genome. Bacterial community function is highly conserved between corresponding body sites in human hosts, while community gene transcription is much more variable; all of these factors affect health in many ways we are just beginning to understand. The most productive future research will integrate the metagenome, metatranscriptome, metaproteome, and metametabolome to model entire ecosystems.
The human body comprises a set of unique ecosystems with a broad range of configurations. At one extreme, the low-diversity vaginal microbiome typically occurs in one of five discrete types: four are each dominated by a different signature species of Lactobacillus, and the fifth is characterized by the absence of Lactobacillus. At the other extreme, the highly-diverse human mouth microbiome has a complex core of prevalent but variably-abundant taxa. In communities such as the vagina, it is easy to perform class discovery by community clustering or typing; organizing the complexity of other human body habitats in this manner has been the subject of much recent work.
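For a low-diversity habitat such as the vagina, community typing can be approximated by a simple dominance rule, sketched below. This is an illustrative simplification, not the clustering procedure used in the studies cited: the 50% dominance threshold is an assumption, the fifth type is approximated as "no Lactobacillus species reaches the threshold" rather than strict absence, and the species names and abundances are examples not taken from the text.

```python
def community_type(relative_abundance, dominance_threshold=0.5):
    """Assign a sample to a Lactobacillus-dominated type or a non-Lactobacillus type.

    `relative_abundance` maps species names to fractions of the community.
    """
    lactobacilli = {species: abundance
                    for species, abundance in relative_abundance.items()
                    if species.startswith("Lactobacillus ")}
    if lactobacilli:
        top_species, top_abundance = max(lactobacilli.items(),
                                         key=lambda item: item[1])
        if top_abundance >= dominance_threshold:
            return f"{top_species}-dominated"
    return "non-Lactobacillus"

# Hypothetical samples (illustrative species and values only)
samples = {
    "sample_1": {"Lactobacillus crispatus": 0.90, "Gardnerella vaginalis": 0.10},
    "sample_2": {"Lactobacillus iners": 0.30, "Gardnerella vaginalis": 0.70},
    "sample_3": {"Gardnerella vaginalis": 0.60, "Prevotella bivia": 0.40},
}

for name, profile in samples.items():
    print(name, "->", community_type(profile))
```

A rule this simple works only because the habitat has a handful of discrete states; in the gut or mouth, where composition varies along continua, the same approach would force arbitrary boundaries.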
In the gut in particular, it has been hypothesized that the microbiome forms "enterotypes," or discrete, stable clusters of similar microbiome configurations. Like the mouth, the gut microbiome is highly diverse; a combination of environmental effects, early exposures, and long-term diet determines the overall community profile. Short-term dietary changes induce more modest and transient effects, realized primarily at the transcriptional level. The existence of human gut enterotypes and their associations with factors such as genetics or diet are much less straightforward than vaginal ecotypes and are an area of active research.
A recent study of individuals from six nations initially proposed three gut microbiome types, dominated by Prevotella, Ruminococcus, and Bacteroides. Subsequent studies [1, 72] also observed a distinct Prevotella type, but suggested that the Bacteroides and Ruminococcus types instead represented two ends of a single continuum between abundant Bacteroidetes and abundant Firmicutes. A related controlled feeding study associated the Prevotella enterotype with a high-carbohydrate diet, and Bacteroides with higher animal fat and protein consumption. Short-term enforced dietary shifts from high-fat and high-protein to high-carbohydrate (and vice versa) somewhat modified organismal abundances, but rarely changed the presence and absence of specific subjects' gut microbes.
The largest currently published study (528 Americans, Malawians, and Amerindians) found that diverse Prevotella OTUs, distinct from the proposed Prevotella enterotype, were highly predictive in distinguishing between the populations, as the genus was more prevalent in Amerindians and Malawians (who ate maize-based diets) than in Americans, who typically had higher levels of Bacteroides. Although Bacteroides and Prevotella were almost mutually exclusive in most subjects, a considerable minority of subjects harbored substantial populations of both genera, raising the intriguing possibility that even the Prevotella and Bacteroides enterotypes may not represent wholly discrete "energy minima" in gut community configurations.