Culture-independent microbiological technologies that interrogate complex microbial populations without prior axenic culture, coupled with high-throughput DNA sequencing, have revolutionized the scale, speed, and economics of microbial ecological studies. Their application to the medical realm has led to a highly productive merger of clinical, experimental, and environmental microbiology. The functional roles played by members of the human microbiota are being actively explored through experimental manipulation of animal model systems and studies of human populations. In concert, these studies have appreciably expanded our understanding of the composition and dynamics of human-associated microbial communities (microbiota). Of note, several human diseases have been linked to alterations in the composition of resident microbial communities, so-called dysbiosis. However, how changes in microbial communities contribute to disease etiology remains poorly defined. Correlating microbial composition with disease integrates only two datasets (phenotype and microbial composition). This article explores strategies for merging human microbiome data with multiple additional datasets (e.g. host single nucleotide polymorphisms [SNP] and host gene expression) and for integrating patient-based data with results from experimental animal models to gain a deeper understanding of how host-microbe interactions impact disease.
Human-associated microbial communities, particularly those of the gastrointestinal (GI) tract, provide myriad beneficial services to their human hosts. For instance, gut microbes transform otherwise indigestible plant polysaccharides into absorbable short-chain fatty acids (SCFA) and participate in the development and maintenance of immune homeostasis [3, 4]. Disruption of any of these mutualistic relationships through shifts in microbial community composition (i.e. dysbiosis) could compromise human health and contribute to disease onset, progression, or duration. Indeed, recent applications of culture-independent sequencing technologies to studies of human health and disease have revealed dysbioses associated with diverse conditions, including antibiotic-associated diarrhea [5, 6], bacterial vaginosis [7–9], celiac disease, colorectal cancer [11, 12], cystic fibrosis [13, 14], esophageal disease, Crohn’s disease and ulcerative colitis (collectively referred to as inflammatory bowel diseases [IBD]) [16–22], irritable bowel syndrome [23–26], necrotizing enterocolitis, non-bacterial prostatitis [28, 29], pre-term birth, obesity [31–33], pouchitis [34–36], and psoriasis.
What has been lacking in studies of human dysbioses is determination of the biological and clinical significance of these community imbalances. Are the changes in enteric microbiota that are observed in IBD simply a consequence of chronic inflammation and its treatment or are they necessary determinants of initiation and/or perpetuation of pathogenesis? Figure 1 presents four possible causal relationships between disease and dysbiosis. First, dysbiosis could be a primary trigger that leads to pathogenesis (Figure 1, Path A). Although the initial factors that disrupt the commensal microbiota are not well defined, their effects are mediated through induction of detrimental immune responses or compromised mucosal barrier function. For example, depletion of commensal bacteria can lead to increased populations of potentially pathogenic organisms, such as occurs with toxin-secreting Clostridium difficile proliferation following antibiotic treatment. Restoring the normal microbiota would, in this scenario, be expected to lessen pathogenesis. Second, dysbiosis could arise in parallel with pathogenesis (Figure 1, Path B), but not serve as a causal factor of disease. In this case, treating the dysbiosis would not ameliorate disease. Similarly (Figure 1, Path C), the pathological condition itself, or its treatment, could cause a secondary shift in microbial community structure. For example, remodeling of the intestinal mucosal milieu following prolonged inflammation could significantly alter both the intestinal environment and the microbial inhabitants of this niche. Finally, even if not a primary cause of disease (Figure 1, Path D), disruption of the commensal microbial community and the mutualistic functions it contributes to the host could contribute to the duration or severity of a disease state. For example, imbalances in the microbiota could also lead to metabolic alterations (e.g. decrease in luminal SCFA) that in turn compromise the protective functions of the epithelial barrier. Alternatively, these abnormal bacteria could secondarily invade a disrupted mucosal barrier or provide antigens and Toll-like receptor ligands that further stimulate adaptive and innate immune responses, thereby augmenting and perpetuating an ongoing inflammatory response [39–41]. In these situations, remediating the dysbiosis should provide palliative relief, but likely would not effect a cure if the primary etiological factor(s) are not addressed. That many of the syndromes associated with dysbiosis are chronic suggests that loss of the normal functions provided by a commensal community, even if not the ultimate causative agent, could significantly prolong the disease state and complicate its resolution.
Delineating the causes and etiological consequences of disease-associated dysbioses remains a crucial challenge in studies of the human microbiota. As with other diseases of uncertain microbial etiology, fulfilling Koch’s postulates in the context of dysbiosis is problematic. First, Koch’s postulates are predicated on the establishment of a parasitic relationship between the host and etiologic agent, whereas some dysbiosis-associated pathologies might instead arise from loss of mutualistic interactions. Second, whereas the defining feature of Koch’s postulates is the demand for isolation of a single pathogen in pure culture, dysbiosis is by definition a community phenomenon. The causal significance of dysbiosis could instead be more effectively evaluated through modifications of Koch’s postulates that have been set forth in the era of risk-factor epidemiology (e.g. those of Hill and Evans) and molecular microbiology (e.g. those of Fredricks and Relman). These guidelines recognize that a constellation of necessary, but not sufficient, causal relationships often is the hallmark of diseases that are chronic, multi-factorial, and/or develop with long latencies. Moreover, the Fredricks-Relman guidelines embrace the utility of molecular-based technologies for detecting microorganisms in the absence of culture, a key feature given that syntrophic and symbiotic relationships among microbial community members are likely to limit the reductionist approach of axenic culture. Thus, the pathogenic capacity of a microbial community as a whole could be examined in the context of these revised criteria.
In evaluating causal relationships, both Hill’s criteria and the Fredricks-Relman guidelines consider the strength, consistency, and specificity of association between the occurrence of a putative pathogen and a disease, as well as their dose-response relationship. The degree to which these criteria are satisfied typically provides the primary evidence in support of a proposed association between dysbiosis and disease. Hill’s criterion of biological plausibility, though not emphasized by the author, is supported in the case of dysbiosis by continuing, intensive work on the biochemical, physiological, immunological, and genetic bases of host-microbiota interactions [40, 41, 45–49]. Although these criteria are critical to establishing correlations, they would not necessarily differentiate among the causal relationships outlined in Figure 1.
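The point that association-based criteria cannot separate the causal paths can be illustrated with a toy simulation. The sketch below is not drawn from the studies discussed here: the function names, the scalar "dysbiosis" and "disease" scores, and the effect size are all hypothetical. It generates data in which dysbiosis drives disease (Path A) and data in which disease drives dysbiosis (Path C), and compares the strength of association seen in a cross-sectional sample.

```python
import numpy as np


def simulate(path, n=5000, effect=1.0, seed=0):
    """Simulate paired (dysbiosis, disease) scores under one causal direction.

    path "A": dysbiosis is the cause and disease the outcome.
    path "C": disease is the cause and dysbiosis the outcome.
    All quantities are hypothetical, unitless scores.
    """
    rng = np.random.default_rng(seed)
    cause = rng.normal(size=n)
    outcome = effect * cause + rng.normal(size=n)
    if path == "A":
        return cause, outcome      # (dysbiosis, disease)
    if path == "C":
        return outcome, cause      # (dysbiosis, disease)
    raise ValueError("path must be 'A' or 'C'")


def correlation(dysbiosis, disease):
    """Pearson correlation between the two scores."""
    return float(np.corrcoef(dysbiosis, disease)[0, 1])


# Independent samples drawn under each causal direction.
r_path_a = correlation(*simulate("A", seed=1))
r_path_c = correlation(*simulate("C", seed=2))
print(f"Path A: r = {r_path_a:.2f}   Path C: r = {r_path_c:.2f}")
```

Under these assumptions both directions yield essentially the same correlation, which is why the discussion turns to temporality and experimental manipulation rather than strength of association alone.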
Our contention is that three additional criteria of Hill, namely coherence, temporality, and experimental support for causality, are the critical factors required to understand the causal role of dysbiosis. Coherence relates to the fit between a hypothesized dysbiosis-disease causal relationship and all other knowledge concerning the disease. This is essentially a hypothesis-generating approach to determining the relationship between dysbiosis and disease. The challenge in patient-based research is that the number of potential variables (genetic and environmental factors) to consider is extremely large compared with the number of samples that can be analyzed. Addressing temporality could require prospective collection of samples prior to the onset of disease (e.g. longitudinal follow-up of unaffected relatives of patients). The generation of genetic experimental animal models provides extremely useful tools for testing causality. However, differences between the experimental animal model and humans present a potential challenge. The ultimate goal of these strands of research is to determine (i) whether there are interventions that will alter dysbiosis and (ii) whether these interventions alter the course of the disease. The following sections describe these criteria in greater detail, as exemplified by an interdependent approach to delineate the role of dysbiosis in human IBD.
Determining whether enteric dysbiosis modifies the incidence and progression of IBD will require integration and correlation of large-scale surveys of microbial communities with rigorously characterized human genotypes and molecular phenotypes (microarray data, proteomic data, and histologic imaging data) in both Crohn’s disease and non-Crohn’s disease cohorts (Figure 2). Moreover, collection of a very large number of well-phenotyped clinical specimens is necessary to accurately link potentially subtle patterns of altered enteric microbiota with host response and disease occurrence. Subsequent mechanistic studies in animal models, including selective colonization of gnotobiotic mice with bacterial species identified by the broad human surveys, combined with targeted in vitro experiments, can determine causal relationships and further define mechanisms of action.
The observation that a subset of IBD patients harbor abnormal enteric microbiota, while the microbiota of other IBD patients are apparently ‘normal’, raises several questions. First, do these differences persist within an individual or do all IBD patients experience periodic flares of dysbiosis? Does dysbiosis arise from prolonged treatment with antibiotics or other agents that could alter the microbial habitat of the intestinal tract? Do host genetics, particularly the presence of IBD risk alleles, affect the incidence or prevalence of dysbiosis in individuals, regardless of disease state? The correlation of abnormal microbiota with a younger age at surgery was intriguing because alleles of NOD2/CARD15 and ATG16L1 that increase the risk of Crohn’s disease also are associated with younger age at surgery. Host genetic factors that affect innate immunity, such as the NOD2 and ATG16L1 genes, could affect microbial composition, particularly in the ileum. The three prevalent risk alleles of NOD2 (Leu1007fs, R702W, and G908R) are associated with the highest relative risks of ileal Crohn’s disease among the ~100 IBD susceptibility loci identified thus far.
To determine whether human genetic factors are associated with dysbiosis, NOD2 and ATG16L1 genotypic data were integrated with a previously published 16S rRNA sequence dataset. This analysis revealed associations between alterations in intestine-associated microbial composition and disease phenotype, NOD2 genotype, and ATG16L1 genotype. An independent set of disease-unaffected ileum samples was then analyzed; these samples were collected from patients undergoing initial surgery who represented three non-overlapping phenotypes: (i) ileal Crohn’s disease, (ii) isolated colitis, and (iii) non-IBD controls. Multivariate analysis of the 16S rRNA sequence dataset selected disease phenotype, C. difficile, and NOD2 genotype as significantly associated with shifts in grossly unaffected ileum-associated microbial composition. In these analyses, potential confounding variables such as obesity and IBD medications were also included. Disease phenotype and NOD2 genotype were also selected as significantly associated with shifts in the relative abundance of the Clostridium coccoides-Eubacterium rectale group measured by PCR.
Thus, NOD2 and ATG16L1 alleles have determining effects on the microbiota that are independent of the occurrence of overt disease. An important implication of this result is that the dysbiosis that has been widely reported in connection with Crohn’s disease is not solely the result of environmental effects such as treatment history or diet. Interestingly, the abundance of an individual enteric genus is rarely significantly correlated with both NOD2 and ATG16L1 genotype, suggesting complex interactions between host genotype and microbial community composition. Moreover, several genera are significantly associated with disease phenotype, but not with NOD2 or ATG16L1 genotype, suggesting either that disease itself influenced the relative frequencies of these genera or that other genetic determinants were involved. Therefore, further associations between other Crohn’s disease polymorphisms and environmental factors likely will emerge as study populations expand.
An association between enteric microbial profiles and genetic predisposition suggests that dysbiosis arises either as a consequence of direct genetic effects on microbial composition, perhaps through altered Paneth cell function [50–52], or as a direct result of the pathogenic process. Thus, the question of whether dysbiosis contributes to Crohn’s disease pathogenesis or is an innocuous byproduct remains to be settled. Furthermore, how mucosal barrier dysfunction or inflammation per se could lead to dysbiosis also is not clear. The plethora of luminal commensal bacteria are largely tolerated by the mucosal immune system and ignored by the systemic immune systems of normal hosts [49, 53, 54], but are essential drivers of pathogenic mucosal and systemic inflammatory responses in genetically predisposed hosts [39–41, 55]. Enteric microbial communities play a critical role in this process by stimulating the development of GI lymphoid tissues [56, 57], fortifying the physical epithelial barrier, and regulating the quality and magnitude of the mucosal immune response [3, 4] to commensal bacteria and potential pathogens. Shifts in microbial community composition that arise from dysfunctional innate mucosal immunity suggest a selective disadvantage to those microbial groups that serve to nurture gut barrier integrity. To fully understand the mechanism(s) by which host genetic factors influence microbial composition, and the functional consequences of the changes in microbial composition, it is critical that the microbiome data also be linked with host and bacterial transcriptomics [59, 60], proteomics [61–63], and metabolomics [64–66].
The strongest evidence that dysbiosis contributes to human disease could be obtained through double-blind, randomized controlled experiments with agents that normalize dysbiotic profiles in individuals with disease or create dysbiosis in normal individuals. Under these conditions, the causal relations between exposure to dysbiosis and subsequent development of pathology then can be defined. However, ethical and practical considerations (how to artificially create a dysbiosis that precisely matches that of a particular disease, how to sustain microbial alterations with antibiotics, probiotics, and prebiotics, and how to reproduce environmental triggers that likely initiate inflammatory processes [40, 52]) limit this approach. Consequently, quasi-experimental studies in which dysbiosis is induced as a consequence of medical (e.g. C. difficile-associated diarrhea resulting from antibiotic exposure) or surgical procedures (e.g. pouchitis following ileal-pouch anal anastomosis) likely will be the dominant means of manipulating human microbiota prior to disease occurrence.
In parallel with quasi-experimental studies of humans, well-designed direct experiments with animal models of disease can contribute important insights into the timing and mechanistic basis of disease pathophysiology in relation to altered microbiota. Multiple models of experimental colitis exhibit dysbiosis similar to that seen in human IBD, with contraction of several Clostridium subsets and expansion of Enterobacteriaceae, including Escherichia coli [68, 69]. Studies in gnotobiotic mice and rats implicate commensal enteric bacteria in the pathogenesis of immune-mediated chronic intestinal inflammation [67, 70, 71] and demonstrate that individual bacterial species relevant to the dysbiosis of human IBD have differential abilities to induce or prevent experimental enterocolitis [70, 72]. Of note, however, not all species that are altered in abundance in experimental colitis can elicit pathogenesis, whereas those that are sufficient to cause colitis may not be affected by disease state. Therefore, dysbiosis might best be viewed as a starting point for careful experimentation, rather than an endpoint.
Host genetic background is a determinant both of enteric bacterial profiles and of inflammatory sequelae to individual bacterial species. For example, NOD2-deficient mice display altered mucosally associated bacterial and Bacteroides spp. composition. These differences were not, however, detected in a comparison of fecal microbiota, suggesting that the effect of NOD2 deficiency on microbial composition was restricted to the ileum, and perhaps reflects changes in mucosa-adherent communities. Similarly, germ-free (sterile) IL-10 deficient mice develop proximal colitis when mono-colonized with any of several adherent/invasive E. coli strains, including one human ileal Crohn’s disease isolate, and aggressive distal colitis and duodenal inflammation when selectively colonized with Enterococcus faecalis, but no disease with Bacteroides vulgatus.
Both traditional probiotic bacterial species (Lactobacillus spp. and Bifidobacterium spp.) and a commensal species that is decreased in human Crohn’s disease (Faecalibacterium prausnitzii) can attenuate experimental colitis. Reports of transmission of colitis [74, 75], obesity, and metabolic syndrome by fecal transfer to normal recipients strongly implicate dysbiosis as a primary etiologic factor in disease pathogenesis, although secondary changes in luminal microbiota following nonspecific intestinal inflammation or injury have been documented. An exciting observation that a common enteric viral pathogen, norovirus, can induce functional and phenotypic alterations in mice with decreased ATG16L1 expression offers proof of principle that environmental triggers can modulate genetically defined disease susceptibility. Recent evidence that human fecal microbial composition can be conserved in gnotobiotic mouse recipients provides an elegant opportunity to explore the pathogenicity of dysbiotic bacteria from human IBD patients in murine models. These selective constitution studies will help define mechanisms of the pathogenicity of human bacterial profiles and microbial subsets in recipients by modeling interactions of human genes, environmental stimuli, and human microbiota.
In addition to experimental animal and human microbial manipulation approaches, longitudinal observational studies represent a critical means of assessing the causal relationship between dysbiosis and disease occurrence. In principle, these studies would determine whether a particular exposure (e.g. alterations in microbiota) precedes development of overt disease. Despite their clear utility, few if any such trials have been conducted to date, although technological innovations in studying human microbiota now make large-scale cohort and/or case-control studies feasible. However, the chronic nature and the protracted preclinical phase of many dysbiosis-associated diseases could complicate interpretation of results, because timing of disease onset is imprecise in these cases, especially when human genetic loci help determine disease susceptibility.
Although the technology of culture-independent metagenomics is well established, development of statistical tools for analyzing microbial community organization has lagged behind the evolution of DNA sequencing technologies. An important component of this work will be continued development of computational and statistical methods for analysis of host-microbe dynamics. Moreover, robust methods are needed to integrate and mine data generated through myriad ‘omics technologies. As Sauer et al. have noted with regard to genetic analysis, "The (traditional) reductionist approach has successfully identified most of the components and many of the interactions but … the pluralism of causes and effects in biological networks is better addressed by observing, through quantitative measures, multiple components simultaneously and by rigorous data integration with mathematical models". A potentially promising approach could be the development of statistical methods for analyzing microbiome data from a biological causal pathway perspective [80, 81]. This analytical methodology is based on structural equation modeling (SEM), a set of interrelated regression equations with random independent as well as dependent variables that allow formulation and testing of directional causal pathway hypotheses. Although widely used in social, economic, and behavioral sciences (e.g. [80, 83–90]), SEM techniques have been applied only sparingly to ecological data (e.g. [91–93]), despite their original proposal by the American geneticist Dr. Sewall Wright. Given the increasingly quantitative nature of biological studies and the availability of multimodality genetic, genomic, proteomic, and other relevant morphological and physiological data, SEM is one of the most suitable statistical modeling and analysis tools for the study of biological networks from a systems biology point of view.
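For readers unfamiliar with SEM, the "set of interrelated regression equations" can be written compactly. The following is one common textbook notation (the LISREL-style form), offered only as a sketch; the symbols are an assumption of this illustration and need not match those used in the cited works or in Figure 3.

```latex
\begin{aligned}
\eta &= B\,\eta + \Gamma\,\xi + \zeta
  && \text{(structural model: directed paths among latent variables)} \\
y &= \Lambda_y\,\eta + \varepsilon
  && \text{(measurement model for endogenous latent variables)} \\
x &= \Lambda_x\,\xi + \delta
  && \text{(measurement model for exogenous latent variables)}
\end{aligned}
```

Here $\eta$ collects endogenous latent variables (for instance, an unobserved microbiome composition), $\xi$ collects exogenous latent variables, $B$ and $\Gamma$ hold the path coefficients, $\Lambda_y$ and $\Lambda_x$ hold the loadings that tie latent variables to the observed measurements $y$ and $x$, and $\zeta$, $\varepsilon$, and $\delta$ are error terms. A directional causal hypothesis corresponds to specifying which entries of $B$ and $\Gamma$ are free and which are fixed at zero.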
In the example shown in Figure 3, the aim is to study the interactions among host genotype, gene/protein expression, physiology, morphology, gastrointestinal microbiome, and host phenotype using structural equation modeling. The composition of the microbiome is treated as an unknown (latent) variable that can be modeled through multiple observed variables using a combination of experimental methods (e.g. different sequencing platforms, quantitative PCR, and microarrays). Each node with an incoming arrow can be analyzed through a regression equation with the given node as the response variable and the originating nodes of the incoming arrow(s) as the independent variables (regressors). If the node is categorical, such as phenotype, the resulting equation will be a logistic regression. A numerical node, such as microbiome, will lead to a linear regression equation. The path coefficients (α, β, and λ; Figure 3) are simply the usual regression coefficients. These regression equations can be analyzed individually, as is commonly done, in a sectional approach. However, a unified simultaneous analysis of this entire equation system is the optimal solution in terms of the power and accuracy of the resulting statistical inference (K. Sharpe, PhD thesis, State University of New York at Stony Brook, 2010).
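The sectional (equation-by-equation) approach described above can be sketched in a few lines. Everything in this example is an assumption made for illustration: the data are simulated, the microbiome is reduced to a single observed numeric score rather than a latent variable, the three-node path diagram (genotype → microbiome → phenotype, plus a direct genotype → phenotype arrow) is a simplification of Figure 3, and the coefficient names only loosely borrow the figure's symbols. It is not the unified simultaneous analysis that the text recommends.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares; X must already contain an intercept column."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def fit_logistic(X, y, n_iter=25):
    """Logistic regression fitted by Newton-Raphson; X contains an intercept."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # predicted probabilities
        grad = X.T @ (y - p)                    # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


# --- Simulated data (hypothetical values, for illustration only) ---
rng = np.random.default_rng(42)
n = 2000
TRUE_ALPHA = 0.8   # genotype -> microbiome path
TRUE_BETA = 1.2    # microbiome -> phenotype path (log-odds scale)
TRUE_GAMMA = 0.5   # direct genotype -> phenotype path (log-odds scale)

genotype = rng.binomial(2, 0.3, size=n).astype(float)    # risk-allele count
microbiome = TRUE_ALPHA * genotype + rng.normal(size=n)  # numeric node
log_odds = -1.0 + TRUE_BETA * microbiome + TRUE_GAMMA * genotype
phenotype = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds))).astype(float)

# --- One regression per node that has incoming arrows ---
ones = np.ones(n)
# Numeric node: linear regression of microbiome on genotype.
alpha_hat = fit_linear(np.column_stack([ones, genotype]), microbiome)[1]
# Categorical node: logistic regression of phenotype on microbiome and genotype.
_, beta_hat, gamma_hat = fit_logistic(
    np.column_stack([ones, microbiome, genotype]), phenotype
)
print(f"alpha={alpha_hat:.2f}  beta={beta_hat:.2f}  gamma={gamma_hat:.2f}")
```

Each fitted slope plays the role of a path coefficient for one arrow. Fitting the equations separately, as here, ignores the shared error structure across nodes, which is the motivation given in the text for a simultaneous analysis.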
SEMs can be readily extended to handle hierarchical (i.e. longitudinal) datasets using methods developed for magnetic resonance imaging (MRI) time series data. Moreover, future work potentially can produce a powerful, unified SEM method that enables the simultaneous analysis of a pathway network with both continuous (possibly non-normal) and categorical dependent variables by extending the newly emerging generalized linear latent and mixed models (GLLAMM). These methods assess statistical relationships among complex datasets with mixed data types (i.e. categorical and continuous measures), for instance allowing for simultaneous correlation of human gene expression and microbiome data with disease outcomes, such as presence of a particular disease. Although these methods cannot prove assertions about causality, they provide a robust framework for assessing and potentially rejecting such hypotheses. In the clinical context, the ability to model the interactions between complex datasets and multi-factorial disease states holds great promise both to elucidate fundamental features of etiopathogenesis and to discover novel biomarkers of disease onset, severity, and progression.
Rather than discovering exotic new pathogens, the application of microbial metagenomics to human health has instead provided compelling, though not conclusive, evidence that disruption of host-microbe mutualism might be central to a variety of pathologies. In these cases microbial communities, not individual parasitic microorganisms, could play the role of pathogen. However, other than in exceptional cases we are not likely to observe disease-associated dysbioses that neatly satisfy all of the criteria that have been proposed to prove causality [42–44]. Nevertheless, these guidelines provide a logical framework for assessing disease etiology that reflects the technological and epistemological developments that have sprung from Koch’s groundwork in clinical microbiology. These investigations have the potential to identify therapeutic targets that can normalize compositional and functional microbial alterations that either primarily or secondarily contribute to disease activity and intensity.
This work was supported by Mucosal and Vaccine Research Colorado (DNF), Crohn’s and Colitis Foundation of America (RBS and EL), Helmsley Foundation (RBS), and NIH grants HG005964 (DNF), DK053347 (RBS), RR018603 (RBS), DK034987 (RBS), and HD059527 (EL). We thank Norman R. Pace for his conception of the microbial community as potential pathogen and Gail Teitzel for her editorial insight.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.