The microbiome is the abundant community of microorganisms within a host (e.g. the human microbiome). These microorganisms produce small molecules and metabolites that have been shown to affect and dictate the physiology of an individual. Functional knowledge of these molecules, often produced for communication or defense, will reveal the interplay between microbes and host in health and disease. The vast diversity in structure and function of microbiome-associated small molecules necessitates tools that utilize multiple `-omics' strategies to understand the interactions within the human microbiome. This review discusses the importance of these investigations and the integration of current `-omics' technologies with tools established in natural product discovery in order to identify and characterize uncharacterized small molecules in the effort towards diagnostic modeling of the human microbiome.
The human microbiome is a diverse and dynamic collection of microorganisms that reside on, in, and around us. Current studies catalog microbes within healthy and non-healthy individuals and hypothesize roles these microbes play. Studies have shown a great diversity in the microbial populations that inhabit a single individual as well as variation in these populations from person to person [1,2]. Additionally, an individual microbiome is dynamic, continuously changing with age, diet, and local environment [3,4]. Distinct areas of the body harbor unique microbial populations, even within specific regions of organs such as the intestine and skin [6,7], enabling unique small molecules to influence local biological processes. Other current projects aim to determine the functional implications of our microbial co-inhabitants, especially in health and disease. Microbiome-derived small molecule functions include nutrition and metabolism [8-10], vitamin synthesis [11,12], proper immune development and response [13-20], and protection from pathogens [21-23]. Germ-free and gnotobiotic animal studies provide insight into defined microbiome-host interactions. However, essential global information about the identities of small molecules and the mechanisms of the biological effects that microbial populations exert in the host is lacking. To further our understanding of how small molecules dictate physiology and pathology, it is critical to understand the dynamics within and between microbial populations and between microbiome and host.
Microorganism-derived small molecules, typically produced for communication or survival, have antibiotic, cytotoxic, anti-fungal, anti-tumor, inflammatory, and other immunomodulatory activities. Similarly, mammalian host-produced small molecules influence microbial populations and elicit production of microbiome-derived small molecules. Absence of these microbiome-derived or host-derived small molecules has been implicated as a causal agent in autoimmune and pathogenic diseases [29-31]. This supports hypotheses stating that humans and their microbiota have co-evolved and that interactions via small molecules help maintain health [32,33]. However, few studies have aimed to identify the impact of the molecules involved in metabolic exchange. The small molecules responsible for biological effects within the microbiome represent a great diversity of chemical structures and functions. It is necessary to integrate current tools and develop more efficient methods to thoroughly identify and characterize microbiome-derived secreted molecules.
Integration of `-omics' and structural elucidation approaches, with tools such as mass spectrometry (MS) and nuclear magnetic resonance (NMR), will enable construction of a catalog of human microbiome small molecules. Genomics, proteomics, metabolomics, and other `-omics' studies catalog known molecules, while natural product workflows elucidate structures of novel molecules. Once a comprehensive catalog is established for the microbiome, a systems biology approach can then predict biological outcome. This review is not comprehensive in terms of the microbiome-derived molecules that have been identified; rather, it highlights the diversity of secreted small molecules within the human microbiome, their effects on health, the importance of global small molecule `-omics,' and the need to develop workflows to pursue such research. Investigating the functional roles of small molecules involved in microbiome-host and microbiome-microbiome interactions will improve understanding of the pathology and treatment of microbiome-related diseases such as diabetes, obesity, allergies, cancers, and caries. Although the concept may still be decades from maturation, an understanding of these metabolic exchange interactions may also take us one step closer to personalized medicine that caters to an individual's unique combination of microbial populations, expressed genes, and environment.
Within the human microbiome, yeasts, fungi, bacteria, and viruses produce small molecules with the capacity to affect cellular development, including intracellular, microbe-microbiome, microbe-host, and host-microbiome development. Small molecules originating from both the microbiota and the mammalian host exhibit a wide array of structural and functional diversity (Figure 1) ranging from lipopeptide antibiotics to molecules that compete for nutrients such as iron chelating siderophores to inflammation-inducing lipopolysaccharides, peptides, polycyclic hormones, and quorum-sensing molecules. The production of many of the microbe-derived molecules, which are encoded in gene clusters, is not directly involved in primary metabolism but instead provides a survival advantage. Hence, these molecules are often referred to as natural products, secondary metabolites, or adaptive metabolites. These molecules are produced within and affect specific biological and spatial niches, as illustrated by Figure 1. The next step towards understanding the influence of our microbiota on health is to understand the functions of each molecule. This review will highlight the functional and structural diversity with a few examples of microbiome- and human-derived small molecules.
Genome sequencing and genome mining of microbiome-derived organisms have revealed that many microbes produce secondary metabolites. The many gene clusters present are often unique to specific microbial strains and acquired via horizontal gene transfer, such as those for microcin B17, colibactin, staphyloxanthin, and aureusimines [41,42]. Almost all of the ubiquitous microflora organisms, such as Clostridia, Propionibacteria, and Burkholderia, have the biosynthetic capacity to make molecules that interact with the environment and have been the focus of recent genome mining research [43-45]. Surprisingly, little is known about molecules produced by the microbiota and how they help control biology outside those of pathogenic interest, demonstrating a clear need for means of detecting and characterizing these compounds.
Bacterial colonies use quorum sensing, a self-generated chemical communication within one microbial population that coordinates behavior through regulation of gene expression, to form biofilms and inhabit a mammalian host. Two quorum-sensing molecules from Pseudomonas aeruginosa, a gram-negative opportunistic pathogen, also have immunomodulatory activities. Pseudomonas quinolone signal (PQS) alters biofilm development and regulates secretion of virulence factors, such as rhamnolipids, which defend P. aeruginosa against the host innate immune response. N-(3-oxododecanoyl)homoserine lactone (HSL) regulates P. aeruginosa virulence factors, suppresses interleukins from stimulated macrophages, and triggers production of chemokines in the host.
Like P. aeruginosa, Bacillus subtilis dedicates ~10% of its genome to the production of molecules that interact with its environment [51,52]. B. subtilis is a gram-positive bacterium commonly found on the skin, in the mouth, and in the intestines. The quorum-sensing pentapeptide pheromone competence and sporulation factor (CSF) from B. subtilis is transported across the host intestinal epithelium, activates the p38 MAP kinase and protein kinase B/Akt cell survival pathways, and induces production of heat shock proteins to prevent oxidative injury to the intestinal epithelial cells.
Staphylococcus aureus and Staphylococcus epidermidis are gram-positive bacteria typically located on epithelial surfaces of humans. However, injury to the epithelial layer potentially allows the Staphylococci to cause infection. The accessory gene regulator (agr) pheromones of Staphylococci, similar to the HSLs in P. aeruginosa, exhibit cross-inhibition of virulence factor expression between S. aureus and S. epidermidis. Aureusimines, secondary metabolites that also regulate virulence factor expression in S. aureus, are necessary for infectivity and virulence, as shown in mice. The antimicrobial delta toxin produced by S. epidermidis works in conjunction with lipids to form neutrophil extracellular traps in defense against invading pathogens, demonstrating a synergistic microbiome-host interaction.
A final example of microbe-host interactions is Helicobacter pylori, a gram-negative bacterium that can cause gastric ulcers and cancers, despite being part of the common stomach microflora in humans. H. pylori produces diethyl phthalate, a chemotactic factor that causes monocyte migration. Gastrin, a peptide hormone produced in the human stomach, illustrates the two-way nature of microbiome-host interactions. Gastrin triggers gastric acid secretion in response to an H. pylori-induced immune response and serves as a specific growth factor for H. pylori, further encouraging H. pylori infection.
In addition to gastrin, other host-derived small molecules that interact with microbiota include glucagon-like peptide 1 (GLP-1), (nor)epinephrine, and estradiol. The production of GLP-1, an incretin hormone secreted by L cells in the small intestine in response to the presence of nutrients, has been shown to be influenced by host microflora in conventional and germ-free rats. (Nor)epinephrine, a catecholamine stress hormone that increases blood pressure, not only has anti-inflammatory activity within the host but can also activate the expression of virulence factors in enterohemorrhagic E. coli [60,61]. Estradiol, the major estrogen predominantly found in females, functions in mammalian sex differentiation and has been shown to prevent apoptosis of sperm cells. Estradiol also permits increased levels of endogenous vaginal bacteria that may lead to infection, such as Lactobacilli, Bacteroides, E. coli, and Staphylococcus saprophyticus, and was shown to affect the growth of Candida, enhancing infection in rats. Continuing research into how human-derived small molecules affect microbiota will greatly aid understanding of their delicate interplay. Based on the genetic capacity of the microbiome, the majority of small molecules that affect the growth and behavior of the microbiota remain undiscovered. Integrated `-omics' based workflows need to be developed to capture the many classes of microbiota-associated molecules capable of altering growth, metabolism and communication, such as antimicrobial microbiome- and host-derived peptides [65,66], hemolytic factors [67–70], and siderophores. While studies have shown that some human diseases are accompanied by changes in microbial populations, analyzing changes in the production of their small molecules is likely to yield great insight into the pathogenesis of the disease and to aid the development of clinical and pharmacological treatments.
There are very few functional studies amidst the genomics, proteomics, and primary metabolomics studies that inventory microbiome communities. Studies of small molecules associated with the microbiome tend to be targeted to individual structural categories such as primary metabolites, peptides, or lipids, and global studies for characterizing new metabolites, or even a single metabolite, still pose a challenge for the `-omics' community. Development of workflows to study small molecules within the microbiome will require incorporation of established methods, including the extensive structure elucidation experience of the natural product community and the data mining expertise of other `-omics' communities. In order to elucidate structures and understand the functional roles of molecules, the `-omics' and natural product communities need to merge and build a comprehensive database of small molecules in the human microbiome to better our understanding of human disease pathogenesis (Figure 2). This section of the review discusses the tools that may drive the integration of `-omics' and structural elucidation towards modeling multiple interactions within the human microbiome.
Though many tools contribute to `-omics' studies and natural product workflows, mass spectrometry (MS) and nuclear magnetic resonance (NMR) are the most powerful; they are already used for individual `-omics' approaches. MS is capable of high throughput identification of proteins, metabolites, and other types of molecules through the generation of important structural information with tandem MS. Like MS, NMR is often used for compound identification as well as to observe global metabolite changes. NMR is able to provide structural information, atomic connectivity and stereochemistry that MS cannot. Ongoing improvements of these two technologies have allowed for optimal data generation from a small sample size or from crude samples. Nanomolar NMR elucidates structures from as little as one nanomole of material, although it is anticipated that this will be in the picomolar range in the near future, while some MS methods can evaluate samples in the sub-picomolar range. NMR of crude mixtures has previously been used for a variety of samples in fungal, spider venom, and single-insect studies, and is advantageous since it eliminates the need to separate molecules prior to analysis [75,76]. Combining liquid chromatography (LC) with MS or NMR allows for the separation of molecules preceding analysis, resulting in improved signal intensity. Additionally, spatial localization of molecules can be characterized using imaging mass spectrometry (IMS), often giving important information on signal distribution in tissue samples or in interactions grown on agar [79,80]. Each tool has an invaluable role in `-omics' studies and has great potential in advancing microbiome-based workflows in studies that will both catalog known molecules and target uncharacterized molecules. Even then, because of the time and effort it takes, structure elucidation should be reserved for pre-prioritized molecules.
Genomics is the most prominent tool to define the make-up and species dynamics of the microbiome. The Human Microbiome Project uses metagenomics to catalog which species are present using 16S rRNA sequencing, and more recently, deep sequencing and sequencing plasmids. Once the complete genome of an organism is sequenced, genome-mining approaches can be used to predict natural products and to discover novel adaptive metabolites such as thiopeptides, non-ribosomal peptides, acid activating t-RNA synthetase homolog genes, thiazole and oxazole modified microcins (TOMMs), lantibiotics and similar peptides. Generation of annotated genomes helps provide a basis for predictive algorithms capable of mining unannotated genomes for new secondary metabolites, which can drastically improve the speed at which new molecules can be targeted and characterized.
With the ability to study translated proteins in an automated and high-throughput fashion, proteomics provides established tools that will contribute to studies of the microbiome. Based on unique peptidic and proteomic profiles, peptidomics and proteomics can identify species within a complex population of different organisms, albeit less effectively than genomics because of slower speed and noisier data. The large number of organisms also increases the probability of random and incorrect identifications; many previous proteomic studies have been incorrectly interpreted [83,84] and the proteomics field has been plagued with irreproducibility. Other challenges include unusual post-translational modifications, non-proteinogenic amino acids, and structural constraints such as those found in cyclic peptides. Understanding the human microbiome thus represents an enormous MS and informatics opportunity for development of tools that can capture the peptidic small molecules of the microbiome. To overcome the large database challenge, de novo sequencing and spectral networks may need to be incorporated into the proteomics branch of the `-omics' workflow development.
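The effect of database size on random identifications can be illustrated with a deliberately simplified calculation. The sketch below assumes that every candidate sequence in a search database has the same small, independent chance of spuriously matching a spectrum; real search engines score candidates in more elaborate ways, and the function name and example numbers are hypothetical, chosen only for illustration.

```python
def prob_any_false_match(p_single: float, n_candidates: int) -> float:
    """Probability that at least one of n_candidates matches a spectrum by chance.

    Assumes (for illustration only) that each candidate matches independently
    with the same probability p_single.
    """
    return 1.0 - (1.0 - p_single) ** n_candidates


if __name__ == "__main__":
    # Hypothetical per-candidate chance of a spurious match.
    p = 1e-6
    for n in (10**3, 10**5, 10**6, 10**7):
        print(f"{n:>10d} candidates: {prob_any_false_match(p, n):.4f}")
```

Under these assumptions the chance of at least one spurious hit grows steadily as more organisms, and therefore more candidate sequences, are added to the search space, which is the concern raised above for multi-species samples.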
The basis of metabolomics lies in the microbial role within human metabolism, which can affect the small molecule profile and be indicative of differences in bacterial communities. The Human Metabolome Project is currently building and expanding the metabolome database with useful information including the chemical, clinical, and functional data of human metabolites (www.hmdb.ca). A metabolomics study of the gut microbiome in pre- and post-ileostomy closure patients found that pre-closure metabolic profiles differ from post-closure, as do the bacterial communities. Building on metabolomics and their respective databases, metabonomics investigates metabolism dynamics, particularly systematic temporal changes.
Metabolic profiling of microorganisms is done using LC-MS, IMS, thin layer chromatography (TLC), and NMR. Tandem MS approaches provide useful sequence tag information on peptidic metabolites, which can be compared with known and predicted tandem MS spectra toward identification. However, as mentioned earlier, irregular post-translational modifications and lack of mass fragment assignments make structure verification difficult. IMS can also be used in metabolomics by giving spatial information on the distribution of regionally abundant metabolites across the surface of a sample. This spatial information can be used to form hypotheses on the function and mechanistic action of observed metabolites, especially with interacting populations in tissue or on solid nutrient media [79,89,90]. IMS also aids in `purifying' the sample via identification of interesting compounds based on unique spatial distribution, which can then be targeted for purification and subsequent characterization using biological assays. While cataloging metabolites and changes within metabolism is important for understanding the cellular effects of microbiome organisms, these techniques should also incorporate structural elucidation of unknown molecules through natural product workflows, as metabolome databases include only a small number of secreted molecules.
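Comparing an observed tandem MS spectrum with a known or predicted one is often done with a similarity score. The following is a minimal sketch of one common choice, a cosine score over binned peaks; it is not the scoring used by any particular tool cited here, and the bin width, function names, and peak lists are assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is an iterable of (mz, intensity) pairs.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine of the angle between two binned spectra (0 = no overlap, 1 = same shape)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical peak lists, not measured data.
    observed = [(100.1, 10.0), (200.2, 5.0), (350.4, 2.0)]
    reference = [(100.1, 9.0), (200.2, 6.0)]
    print(round(cosine_similarity(observed, reference), 3))
```

A simple binned score like this does not account for mass shifts caused by modifications, which is one reason the irregular post-translational modifications mentioned above complicate spectrum matching.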
Genome mining has long been used to predict molecular structures and is still used to successfully characterize novel non-ribosomal peptide synthetase (NRPS), polyketide synthase (PKS), terpenoid, and other natural products. Genome mining searches for genes or gene clusters that encode enzymes involved in the biosynthesis of natural products based on sequence alignment with other characterized enzymes involved in natural product biosynthesis. Genome screening programs such as ClustScan, NRPS-PKS and “NP.searcher” are used to predict locations of gene clusters and the structure of their putative products. Success of genome mining depends on the availability of complete microbial genomes; it will therefore prove especially powerful when used in conjunction with metagenomics, as sequenced genomes and plasmids can be automatically fed into a pipeline to be mined for secondary metabolite production. The tools currently in place provide a good starting point for data mining; however, improvements are still needed in predictive software for adaptive metabolites, including a ribosomally-encoded peptide predictor, incorporation of 6-frame translation into genome mining searches, and consolidation of metabolite and small molecule databases.
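Six-frame translation, mentioned above as a desirable addition to genome mining searches, means translating a DNA sequence in all three reading frames on both strands so that short peptide-encoding regions are not missed. A minimal sketch follows, assuming the standard genetic code and an unambiguous A/C/G/T sequence; the function names and the example sequence are illustrative and not taken from any of the programs named above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Translate dna in frames +1..+3 (forward strand) and -1..-3 (reverse strand)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    # Toy sequence for illustration only.
    for frame, peptide in six_frame_translation("ATGGCCTAA").items():
        print(frame, peptide)
```

Searching the six translated frames, rather than only annotated genes, is one way small ribosomally encoded peptides that were never annotated could still be picked up by a predictor.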
Construction of microbiome-derived small molecule libraries from `-omics' studies and natural product workflows should reflect the format used in the Human Metabolome Database by including the chemical, clinical, and biochemical data of each molecule including known locations within the human body. Catalogs may also incorporate biological information about a microbe, such as how and under which conditions metabolite expression is regulated, the native environment, microbiome- and host-interactions, and annotation of predicted gene clusters. However, as with the cataloging of bacterial species within the human body, cataloging the many small molecules secreted by the microbiota is an intermediary step. Tools capable of characterizing the biological roles of small molecules need to be developed, including universal high throughput bioassay screens for toxicity, antimicrobial, antifungal and anticancer activities, among other assays. Once bioactivity of one or a combination of small molecules is verified in vitro, structure-activity relationships should be determined. Therapeutically relevant compounds can also be modified based on this information to produce potent analogs with the least amount of toxicity. Quantitation of each molecule and its effects when tested alone, and in combination with other molecules, will also be key pieces of information when establishing a database for computational prediction models. An effective catalog of molecules with chemical, biological, and interaction information will allow for network mapping and systems biology for modeling of the human microbiome.
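One way to picture such a catalog entry is as a structured record. The sketch below is purely illustrative: the field names are hypothetical and do not reproduce the Human Metabolome Database schema, and the example entry is filled in only with details stated earlier in this review.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SmallMoleculeRecord:
    """Hypothetical catalog entry for a microbiome-associated small molecule."""
    name: str
    producer: str                      # microbe or host tissue that makes the molecule
    molecule_class: str                # e.g. peptide, lipid, siderophore
    body_sites: List[str] = field(default_factory=list)
    host_interactions: List[str] = field(default_factory=list)
    regulation_notes: str = ""         # how and when expression is regulated, if known
    predicted_gene_cluster: str = ""   # annotation from genome mining, if any


def records_at_site(catalog: List[SmallMoleculeRecord], site: str) -> List[str]:
    """Return the names of cataloged molecules reported at a given body site."""
    return [record.name for record in catalog if site in record.body_sites]


if __name__ == "__main__":
    csf = SmallMoleculeRecord(
        name="competence and sporulation factor (CSF)",
        producer="Bacillus subtilis",
        molecule_class="quorum-sensing pentapeptide",
        body_sites=["skin", "mouth", "intestine"],
        host_interactions=[
            "activates p38 MAP kinase and protein kinase B/Akt pathways",
            "induces heat shock proteins in intestinal epithelial cells",
        ],
    )
    print(records_at_site([csf], "intestine"))
```

Fields left empty in the example, such as regulation conditions and the predicted gene cluster, mark the kind of information that functional studies and genome mining would need to supply.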
Mapping small molecule networks within the human microbiome in silico will lead to novel systems biology approaches to study microbiome-host interactions and generate predictions about them. This will require a strong bioinformatics infrastructure similar to the ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB) and the RCSB PDB (Research Collaboratory for Structural Bioinformatics Protein Data Bank) for proteins. Libraries will include microbiome-derived small molecules, their networks, biosynthesis, primary functions, and biological consequences. Integration of natural products databases and libraries, metabolomics databases, microbial profiles, and peptidomics/proteomics databases into a searchable model network of small molecules between microbiome and host can then lead to prediction, diagnosis, and treatment of microbiome-associated diseases.
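As a toy illustration of what a searchable small molecule network might look like, the sketch below stores producer, molecule, and effect relationships as a directed graph and lists everything reachable downstream of a chosen node. The edges are limited to interactions described earlier in this review; the relation labels, function names, and graph representation are assumptions, not an existing resource.

```python
from collections import defaultdict, deque

# (source, relation, target) triples drawn from interactions described in this review.
EDGES = [
    ("Pseudomonas aeruginosa", "produces", "PQS"),
    ("PQS", "regulates secretion of", "rhamnolipids"),
    ("rhamnolipids", "defend against", "host innate immune response"),
    ("Helicobacter pylori", "produces", "diethyl phthalate"),
    ("diethyl phthalate", "induces", "monocyte migration"),
    ("human stomach", "produces", "gastrin"),
    ("gastrin", "promotes growth of", "Helicobacter pylori"),
]


def build_graph(edges):
    """Build an adjacency list mapping each source to its (relation, target) pairs."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return graph


def downstream(graph, start):
    """Breadth-first list of every node reachable from start, excluding start itself."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _relation, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


if __name__ == "__main__":
    network = build_graph(EDGES)
    print(downstream(network, "gastrin"))
```

Even in this small example, following the edges from gastrin reaches monocyte migration through H. pylori, the kind of indirect host-microbe-host link that a full network model would be expected to surface.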
The diverse assemblies of microbes that inhabit each individual exhibit tremendous influence over the health of the mammalian host. With changes in an individual microbiome observed in conjunction with the onset of disease, immune system development, uptake of dietary nutrients and aging, understanding the role of these co-inhabitants may provide key insight into improving our quality of life. Since secreted small molecules produced by both the host and microbiota serve to control biology, identifying and characterizing these secondary metabolites will provide information on the nature of the chemical interactions that allow microbiota to influence their human hosts. Despite long standing initiatives to study the microbiome, tools and workflows necessary to study the microbiome-derived small molecules in a global fashion are severely lacking. As analytical, biological and bioinformatics workflows develop, new paradigms governing human health and treatment can be used to design personalized medical treatment and diets based on a person's individual microbiota.
This work was supported by Beckman Foundation, V-foundation, Hearst foundation, and National Institute of General Medical Sciences Grant NIH GM086283, GM094802 (P.C.D.). JYY is supported by the Ruth L. Kirschstein National Research Service Award, NIH 1 T32 EB009380-01. JDW is supported by the NIH Molecular Biophysics Training Grant, NIH training grant number 5T32GM008326-20. JRK is supported by the Department of Chemistry & Biochemistry, UCSD.
Papers of particular interest, published within the period of review (2008–2010), have been highlighted as:
• of special interest
•• of outstanding interest