|Home | About | Journals | Submit | Contact Us | Français|
Conceived and designed the experiments: B. White B. Wilson CJY SMT KEN. Performed the experiments: CJY SMT AVU MEBM MT MG SL MAS JSW MC. Analyzed the data: CJY SMT AVU AG MH DJC MAS JSW MC. Contributed reagents/materials/analysis tools: B. White B. Wilson CJY KEN SRL RS MAS JSW. Wrote the paper: CJY B. White B. Wilson SMT AVU SRL RS AG KEN MEBM.
Bacterial vaginosis (BV) is the most common vaginal disorder of reproductive-age women. Yet the cause of BV has not been established. To uncover key determinants of BV, we employed a multi-omic, systems-biology approach, including both deep 16S rRNA gene-based sequencing and metabolomics of lavage samples from 36 women. These women varied demographically, behaviorally, and in terms of health status and symptoms.
16S rRNA gene-based community composition profiles reflected Nugent scores, but not Amsel criteria. In contrast, metabolomic profiles were markedly more concordant with Amsel criteria. Metabolomic profiles revealed two distinct symptomatic BV types (SBVI and SBVII) with similar characteristics that indicated disruption of epithelial integrity, but each type was correlated to the presence of different microbial taxa and metabolites, as well as to different host behaviors. The characteristic odor associated with BV was linked to increases in putrescine and cadaverine, which were both linked to Dialister spp. Additional correlations were seen with the presence of discharge, 2-methyl-2-hydroxybutanoic acid, and Mobiluncus spp., and with pain, diethylene glycol and Gardnerella spp.
The results not only provide useful diagnostic biomarkers, but also may ultimately provide much needed insight into the determinants of BV.
Mechanistic understandings of a number of diseases that are believed to have microbial axes have escaped conventional biological analyses –. The application of integrative systems biology approaches exploiting multiple data sources holds much promise in developing a fuller understanding of these diseases , .
Bacterial vaginosis (BV) is the most common vaginal disorder in reproductive age women. BV is characterized by a creamy grey discharge containing clue cells that typically has a pH>4.5 and an amine or “fishy” odor , . BV is discomforting to women, and increases the risks of infertility, pre-term birth and the acquisition of sexually transmitted infections including HIV  making it of particular clinical interest. The underlying mechanism that causes BV remains unknown.
The vaginal microbiomes of healthy reproductive age women are often, but not always, dominated by lactic acid-producing Lactobacillus species , . Women with BV commonly exhibit reductions in lactobacilli. Gardnerella vaginalis (formerly Haemophilus vaginalis or Corynebacterium vaginale) was first proposed as a bacterial cause of BV after it was found to be almost universally cultured from BV-diagnosed women . Gram-stain-based methods described by Nugent  and Spiegel  have been used to delineate BV from other vaginal disorders or infections (vaginal candidiasis/trichomoniasis, gonorrhoeae, chlamydia). These methods apply a numbered ranking system from 1–10, where an increase reflects reductions in the relative proportions of Lactobacillus morphotypes and increases in Gardnerella and other Gram-variable rod shaped morphotypes. The initial method of Spiegel  was evaluated by Nugent  to be only slightly better than chance in predicting BV, while Nugent and colleagues reported an almost 20% false positive rate in describing an improved approach. Recent studies suggest just 16%–37% of women with a Nugent score indicative of BV actually show symptoms of the disease , . A likely confounding factor of these approaches is that reductions in lactobacilli are also occasionally seen in asymptomatic individuals ,  and G. vaginalis is commonly found in the vaginal tract, regardless of disease symptoms , , –. A more definitive diagnosis is based on the Amsel criteria , which diagnoses BV based on the presence of three of the four symptoms (discharge, odor, pH>4.5 or clue cells). Because of the limitations associated with the Nugent method, Amsel criteria are more commonly employed in a clinical setting and CDC guidelines preclude treatment without reported or observed symptomology. Despite this, Nugent score continues to be used in research as a diagnostic tool and this may have confounded our current understanding of BV.
Prior to the development of Gram stain-based methods, Spiegel and colleagues  favored volatile fatty acid (VFA) profiles of women exhibiting symptoms of BV and asymptomatic women, noting a strong correlation between the succinate to lactate ratio and disease state. However at the time, the equipment necessary to evaluate VFAs was not commonly available in many clinical laboratories .
Here we investigated the potential for modern biological techniques, including 16S rRNA gene-directed evaluations of community composition, and metabolomic profiles to adequately distinguish patients with BV from those without BV. Finally, we combined 16S rRNA gene and metabolite profiles with the patients' metadata and used a network-based approach to discern potential, but previously unrecognized, determinants of BV, leading to new and very testable hypotheses.
We began our multi-omic approach by performing deep sequencing using 454 FLX-titanium technology of the V1-V3 region of 16S rRNA genes amplified from vaginal lavage samples collected from 36 pre-menopausal women of reproductive age. The women sampled included five Amsel-positive women with a high Nugent score (≥7) and three Amsel-positive women with a low (<3; n=2) or moderate (4–6; n=1) Nugent score. The 28 Amsel-negative women also included low (n=17), moderate (n=7) and high (n=4) Nugent scoring representatives. These women represented a broad range of self-reported age (20–45), ethnicity (African American, Asian American, and Caucasian), sexual behaviors, and hygiene practices (Table 1).
Richness (average species observed per 5000 reads (SOBs5K): high Nugent score median =920.5 vs. low-moderate Nugent score median=375.1; Fig. 1A) and diversity analyses using both Shannon's (high Nugent score median=4.79 vs. low-moderate Nugent score median=3.10; Fig. 1C) and Simpson's (high Nugent score median=0.029 vs. low-moderate Nugent score median=0.18; data not shown) diversity indexes indicated a significant increase in diversity (p<0.0003) and richness (p=0.01) in women with a Nugent score≥7 (Fig. 1; Table 1). The same analyses (symptomatic BV median=441.3 vs asymptomatic BV median=689.1, SOBs5K; symptomatic BV median=4.12 vs asymptomatic BV median=3.48, Shannon's index; symptomatic BV median=0.047 vs asymptomatic BV median=0.059, Simpson's index) also indicated a general, though not significant increase in richness (p>0.2) and diversity (p=0.1) in women displaying symptoms of BV (Fig. 1; Table 1).
We examined the genus-level composition of each sample (Fig. 2 and Table S1). 16S rRNA gene reads assigned to lactobacilli (p>0.7) were found in all samples ranging in proportions from 0.5% to >99.9%. We observed a marginal decrease in the maximum, median and average proportion of lactobacilli 16S rRNA gene reads in samples from women determined by the Amsel criteria  as having BV compared to samples from asymptomatic women, but it was not significant (p=0.97). Equally, no significant differences were observed in the proportion of lactobacilli 16S rRNA gene reads among samples determined to have high, moderate or low Nugent scores (p>0.55) .
16S rRNA gene reads assigned to the genus Gardnerella were found in samples from 21 of the 36 subjects with proportions of Gardnerella ranging from 0.000006%–38% (median 0.74%) of reads. Gardnerella 16S rRNA reads were more common in women assessed by the Amsel criteria to have BV (88%) than others (50%) and in women found to have a high Nugent score (100%) compared to those with moderate (62.5%) or low (37%) Nugent scores. Gardnerella were significantly more abundant in samples presenting with high Nugent scores relative to those with low Nugent scores (p=0.05), but were not significantly different among samples assessed by Amsel criteria  to have BV or not (p=0.4).
Dialister was the only other genus-level taxon found to be significantly more abundant in samples with high Nugent scores relative to those with low or moderate Nugent scores (p=0.05). However, again the proportions of 16S rRNA reads assigned to this genus were not significantly different among samples assessed by Amsel criteria to have BV or not.
Ravel et al.  recently reported five distinct vaginal microbial biotypes, characterized by the dominance of Lactobacillus crispatus (I), L. gasseri (II), L. iners (III), L. jensenii (V), or an increased proportion of other strictly anaerobic bacteria (IV). No read within our data set was assigned by speciateIT (algorithm described in ) to L. iners, the dominant species of vaginal biotype III communities . Consistent with this, a blast search using the 16S rRNA gene sequence of the L. iners type strain (CIP109878; Genbank accession HE573916) as the query and our dataset as a database yielded no match within 3% sequence ID. This was unexpected as L. iners had previously been detected in four of the eight samples from this dataset that were previously evaluated using full-length clone libraries . We therefore used L. iners-specific primers to qualitatively test for L. iners among our samples. Using this approach L. iners was detected in 17 of the 36 samples (Fig. S1). We proceeded to quantitate L. iners 16S rRNA genes in these samples using qPCR, with the results presented in Table 2. Even when accounting for L. iners our total numbers of lactobacilli exceeded those collectively contributed from L. crispatus, L. gasseri, L. iners, and L. jensenii. This may indicate the common presence of other Lactobacilli species, or may simply reflect reads that could not be accurately classified to the species level.
We then evaluated the vaginal biotypes of our cohort using the same methodology as Ravel and colleagues  substituting in the qPCR data for L. iners as described in the methods. The majority of our samples were found to be either biotypes I (36%), II (25%) or IV (19%), biotypes III (14%) and V (6%) were also observed.
Specifically considering the relative representation of Lactobacillus species that define biotypes I, II, III and V, we found significant reductions in the proportion of L. crispatus among samples with high compared to low Nugent scores (p=0.001).
We then applied non-metric multidimensional scaling (nMDS) to a Bray-Curtis dissimilarity matrix created from the genus-level taxonomic classifications normalized across the dataset (Fig. 3). Using this approach we found no clear separation among samples from women determined by Amsel criteria to have BV or those from other samples. However, we did observe some discrimination among women with high or low Nugent scores (Fig. 3A). Analysis of similarities (ANOSIM) supported this separation (ANOSIM R=0.652, p=0.0002). Focusing these approaches to the proportions of 16S rRNA reads determined to be from Lactobacillus or Gardnerella species improved the separation of low and high Nugent categories (ANOSIM R=0.724, p=0.0001; Fig. 3B), but failed to improve delineations based on Amsel criteria. In both analyses, samples with a moderate Nugent score were not significantly different from either low or high Nugent scored samples (R<0.34, p>0.8). Our ability to discriminate disease states did not improve using species-level (3% sequence similarity) Operational Taxonomic Units (OTUs; data not shown).
Next we sought a metabolic description of BV. We subjected each of the samples to GC-MS analysis using both derivatized and underivatized lavage samples to examine the relative abundances of both volatile and non-volatile metabolites (Table S2). We identified 176 distinct metabolites across the 36 samples (Fig. 4). We used the resulting data to construct a Bray-Curtis dissimilarity matrix, which resulted in nMDS and ANOSIM analyses that had significant increases in discriminatory power (ANOSIM R>0.895, p<0.001; Fig. 3D). The vaginal metabolomes fell into two broad clusters (Fig. 4) distinguished by 48 metabolites, which we will refer to as metabotypes I and II (Table 3). Samples from women displaying symptoms of BV formed discrete subgroups within each metabotype (henceforth referred to as SBVI and SBVII). The two symptomatic BV-types were also evident in the nMDS analysis (Fig. 3D). Samples with a Nugent score≥7 fell into both metabotypes, capturing all SBVI samples, but less than half of the SBVII cases. We observed significant differences in the relative concentrations of a total of 67 metabolites among samples with high Nugent scores (n=37), SBVI (n=52), or SBVII (n=14), and other samples (>3-fold and p <0.05; Fig. 5 and Table 4). SBVI and SBVII shared eight metabolites (Fig. 5). Nearly all significantly affected metabolites from samples with high Nugent scores and the majority of those affected metabolites in SBVI and SBVII, including those shared between the two symptomatic BV types, corresponded to decreases of the respective metabolite in the defined diseased state. The exceptions were putrescine, cadaverine, 2-methyl-2-hydroxybutanoic acid, hydroxylamine, glycolic acid, tetradecanoic acid (significantly affected in SBVI) and butyrolactone (significantly affected in SBVII), all of which were significantly increased in both symptomatic BV metabotypes. Eight metabolites were not detected in either SBVI or SBVII, but were found in all other samples (Table 4). This included the metabolites, 2,3-hydroxypropyl-2-aminoethyl phosphate (type I), cis-11-octadecanoic acid, and ribose-5-phosphate, whose relative concentrations were low, and varied to such an extent within the Amsel-negative samples that they were not detected as significant.
Exploiting all the collected experimental and demographic data, we applied a network approach to ascertain the linear parametric relationships among relative taxon or metabolite abundances; parametric patient metadata and Nugent scores; and their non-parametric relationships with patient demographics, symptoms and non-parametric metadata (Fig. S2 and Table S3). Using randomized variables, we determined that significance thresholds of ≥0.6 or negative correlations of ≤−0.4 for either Pearson's (parametric) or Spearman's (non-parametric) correlation coefficients were most informative of significant correlations (Fig. S3 and Fig. S4). A network constructed from these data revealed a complex web of interactions defining the BV state.
In the resulting network, Nugent score was inversely correlated with lactic acid (i.e. an increase in Nugent score=a decrease in lactic acid; Pearson's R=−0.68), but was only indirectly connected via glutamic acid (Spearman's R=−0.74) to symptomatic BV, as defined by Amsel criteria (Spearman's R=−0.58; Fig. S5). Nugent score was not correlated with any of the characteristic BV symptoms (discharge, odor, vaginal pain, itching), but was inversely correlated with L. crispatus (Pearson's R=−0.49) and Ravel's biotype I  (Spearman's R=−0.62). Both a high Nugent score (Spearman's R=−0.60) and BV as determined by Amsel criteria (Spearman's R=−0.41) were negatively correlated with sedoheptulose (Fig. S5A), which was inversely correlated with Ravel’s biotype IV  (Spearman’s R=−0.47) and the undertaking of anal sex (Spearman’s R=−0.51).
SBVI and SBVII formed direct links to numerous metabolites and certain microbial genera. SBVI was positively correlated to the genera Mobiluncus (Spearman’s R=0.80) and Allisonella (Spearman’s R=0.61), while negatively correlated with 15 distinct metabolites (Fig. 6). SBVII was positively correlated with Hallella (Spearman’s R=0.60), while negatively correlated with 53 metabolites. Interestingly, African American women were indirectly connected to SBVII through Hallella (Spearman’s R=0.6; Fig. 7). All but one BV-affected African American women were determined to be SBVII, although ethnicity was not the sole defining determinant, as two non-African American women were also SBVII.
Douching (parametrically determined as days since last douche in women who douche and contrasted to women who do not douche) and SBVII shared common negative connections to 21 different metabolites. However, as was observed for ethnicity, douching was not a universal determinant of SBVII. Douching appeared to have a highly perturbing effect on the vaginal environment. Douching correlated with changes in the abundance of 64 metabolites (Fig. 7), including increased levels of several sugars (e.g., Fructose, Mannose, and Galactofuranose), 2-hydroxyglutarate, α-ketoglutarate, and N-acetyl-glucosamine (a predominant component of mucin), and reductions in many fatty acids (e.g., Docosanoic acid, Dodecanoic acid, Eicosanoic acid, Erucic acid, Heptadecanoic acid, Lignoceric acid, Octadecanoic acid, Palmitic acid, Succinate and Fumarate), fatty acid derivatives and cholesterol. Douching was negatively correlated with Mobiluncus spp. (Pearson's R=−0.43), Pseudomonas spp. (Pearson's R=−0.47), Mycoplasma spp. (Pearson's R=−0.44) and Allisonella spp. (Pearson's R=−0.42) and was positively linked to increases in the genera Cloacibacterium (Pearson's R=0.89), Morganella (Pearson's R=0.89) and Clostridium (Pearson's R=0.64).
Douching also did not appear to separate metabotypes I and II, affecting just nineteen of the 48 metabolites separating the two metabotypes (Table 3). Although not correlated with SBVI or SBVII, Veillonella (Spearman's R=−0.53), Gardnerella (Spearman’s R=−0.50), Aerococcus (Spearman’s R=−0.47), Atopobium (Spearman’s R=−0.45), and Peptostreptococcus (Spearman’s R=−0.41), along with Ravel’s biotype IV (Spearman’s R=−0.44), were all negatively correlated with metabotype I.
We found correlative links among symptoms, microbial genera, and metabolites (Fig. 6). The symptoms ‘itching’ and ‘vaginal pain’ were linked (Spearman’s R=0.64) to each other, as well as to diethylene glycol (Spearman’s R>0.6) and several distinct metabolites. The symptoms ‘odor’ and ‘discharge’ were linked (Spearman’s R=0.79) to each other, but were separate from ‘itching’ and ‘vaginal pain’ (Fig. 6). Odor was strongly associated with increased levels of putrescine (Spearman’s R=0.84) and cadaverine (Spearman’s R=0.82), and was linked to SBVI (Spearman’s R=0.85), but not to SBVII (Spearman’s R=0.11). Discharge was more strongly linked to the metabolite 2-methyl-2-hydroxybutanoic acid (Spearman’s R=0.73), and was strongly linked to all samples described as BV by the Amsel criteria (Spearman’s R=0.84).
Putrescine (Spearman’s R=0.81) and 2-methyl-2-hydroxybutanoic acid (Spearman’s R=0.62) were linked to increases in Mobiluncus spp. Putrescine was also linked to Prevotella spp. (Spearman’s R=0.69), Anaeroglobus spp. (Spearman’s R=0.61) and Porphyromonas spp. (Spearman's R=0.60). Cadaverine (Spearman's R=0.7) and putrescine (Spearman's R=0.73) were both linked to increases in Dialister spp., and cadaverine was also linked to increases in Streptococcus spp. (Spearman's R=0.84) and Mycoplasma (Spearman's R=0.66). Diethylene glycol was linked to multiple genera, including Stenotrophomonas spp. (Spearman's R=0.85), Janthinobacterium spp. (Spearman's R=0.85), Morganella spp. (Spearman's R=0.85), Comamonas spp. (Spearman's R=0.84), Delftia spp. (Spearman’s R=0.84), Pseudomonas spp. (Spearman’s R=0.74), Gardnerella spp. (Spearman’s R=0.67) and Acinetobacter spp. (Spearman’s R=0.61).
In terms of demographic parameters, marriage formed a direct negative correlation with the Amsel-determined BV state (Spearman’s R=−0.48). However, most of the subjects' demographics and behaviors were many steps removed from either Amsel-determined BV type or a high Nugent score within the resulting positive association network (Table 5).
Spiegel and colleagues  previously compared VFA profiles between BV-positive and asymptomatic women, finding an increase in succinate, acetate, butyrate and propionate with a concomitant decrease in lactate occurring with disease state. Consistent with this, our metabolomic analyses revealed that the most distinguishing determinant among samples from women with a high Nugent score and those with a low-moderate Nugent score was a marked decrease in lactic acid. Although, the relative proportion of lactic acid was not found to be significantly different among women delineated by the Amsel criteria (p>0.05). We also observed an increase in acetate with SBVI and propionate with SBVII. However, we did not observe significant changes in succinate or butyrate for women with BV-positive Amsel criteria, or those with a high Nugent score (p>0.05).
One rationalization for these observations could be a reduction in lactic acid-producing lactobacilli along with increases in other microbes capable of producing acetate and propionate in the vaginal tracts of BV-positive women. Decreases in lactobacilli have been described previously in women with high Nugent scores , as have increases in G. vaginalis , which are known to produce acetate and succinate . The relative proportions of lactobacilli and Gardnerella spp. are generally considered hallmarks of the Nugent criteria . Consistent with this, the relative proportions of Gardnerella spp. and lactobacilli 16S rRNA reads were highly discriminatory among samples with high and low Nugent scores. Gardnerella spp. were significantly more abundant in samples from women with high Nugent scores. However, while the abundance of lactobacilli was clearly reduced in several samples with high Nugent scores, our results did not indicate a significant reduction in overall representation of lactobacilli consistent with Nugent classifications. This finding may have been confounded by the inability of our sequencing primers to detect certain lactobacilli, such as L. iners. However, the addition of qPCR enumerations of L. iners did not affect these findings. Dialister spp. were also observed to be significantly more abundant in samples with high Nugent scores relative to those with low or moderate scores. Human isolates of Dialister spp. are reported to produce propionate , , presumably through the decarboxylation of succinate , which may influence the relative abundances of these metabolites in Amsel-positive samples.
The relative abundances of 16S rRNA reads from lactobacilli, Gardnerella spp., or Dialister spp. were not significantly different among Amsel-positive or -negative samples, suggesting a limited relationship between the relative abundance of these bacterial genera and disease symptomology. 16S rRNA reads belonging to Gardnerella spp. were detected in 50% of Amsel-negative samples, which is consistent with historical reports, where G. vaginalis could be cultured from 58%–68% of healthy samples , . Yet, 16S rRNA reads pertaining to Gardnerella spp. were more commonly detected in women distinguished by either a high Nugent score or by positive Amsel criteria. These observations may reflect recent findings that different strains of G. vaginalis have different metabolic or virulence potentials , , with only a subset of strains being important to the onset of BV or defining the BV state.
In contrast to our diagnostic capabilities using 16S rRNA gene profiles, we were able to clearly delineate Amsel-defined BV-positive samples from BV-negative samples using metabolite profiling. Our analysis revealed the presence of two distinct metabotypes that are delineated by significant differences in 48 discrete metabolites. It was unclear if these metabotypes were affected by microbial or host metabolisms, but we did observe negative correlations between metabotype I and Ravel's biotype IV vaginal microbiomes , as well as biotype IV microbial genera including Aerococcus, Atopobium, Gardnerella, Peptostreptococcus and Veillonella. Amsel-defined BV-positive samples formed distinct groups within each metabotype, which we denoted as SBVI and SBVII.
We observed considerable overlap between the significantly affected metabolites of women with a high Nugent score and those with SBVI, suggesting that Nugent criteria were highly diagnostic for this diseased state. Consistent with this, all SBVIs had a Nugent score≥7. In contrast, no overlap was observed between the vaginal metabolomes of women with a high Nugent score and those with SBVII. In these samples, more than half had a Nugent score<7.
Women defined as BV-negative based on Amsel criteria, but having a high Nugent score, were observed in both metabotypes I and II. Our networking approach suggested that a high Nugent score was only indirectly connected to Amsel criteria, via a negative connection to sedoheptulose, a common urinary metabolite .
The most determinate metabolite for both SBVI and SBVII was a marked reduction in 1,2 propanediol, a hydrogenated derivative of lactic acid. This reduction reaction is carried out by some lactobacilli under pH conditions <5.8 . Therefore, it is unclear whether 1,2-propanediol, and by extension the subset of lactobacilli capable of reducing lactic acid, is critical to vaginal health or if this merely reflects an elevated pH, as is characteristic of a BV state. The reduction of lactic acid also typically results in acetic acid at a 11 ratio with 1,2-propanediol . In agreement with this, we observed a marked reduction in acetic acid, but only in SBVI, which may reflect additional metabolic pathways that utilize acetic acid in the vaginal microbiomes of metabotype II women.
The second most determinate metabolite for each symptomatic BV type was 1-acetoxy-2-propanol, which can be resolved from 1,2-propanediol in the presence of lipase activity . Many of the metabolites significantly affected in samples from women with a high Nugent score or displaying symptoms of BV types were fatty acids, including both SCFA and those with longer chain moieties. These and other metabolites found to be reduced in SBVI and SBVII (such as ethanolamine, glycerol, serine, phosphate) are related to glycerophospholipid metabolism and are highly suggestive of either reduced glycerophosphodiester phosphodiesterase-activity  or rapid utilization of these and other host cell membrane degradation bi-products (such as cholesterol; 2-aminoethylphosphate which were also significantly decreased) in the symptomatic BV state. Consistent with the latter hypothesis, increases in the biomembrane lipid anchor, tetradecanoic acid, an essential structural component of host cell membranes and their derivatives, was observed in the vaginal metabolomes of SBVI patients (Fig. 5). We also noticed significant differences in amino acids characteristic of integral membrane proteins. Amino acids with aliphatic side-chains, predominant in the transmembrane regions spanning the middle of the phospholipid bilayer (leucine, isoleucine, alanine and valine) were all depleted in metabolome samples from women with SBVI and/or those with a high Nugent score. While the amino acids tyrosine and tryptophan, which are typically located at the polar/non-polar interface of transmembrane protein regions were depleted in samples from women with high Nugent scores. The finding that these observations overlap with Nugent scores implies an important role for G. vaginalis and other Nugent-defined morphotypes in the reduction of cell wall integrity. In accord with this, the genomes of two G. vaginalis strains isolated from women with symptomatic BV harbor mucinolytic enzymes .
We tested the interrelationships among microbial genera, metabolites, host behaviors, disease symptoms, Nugent score, and clinician-determined disease state using a systems-based networking approach. Although the nature of these analyses precludes conclusive evidence, they potentiate significant new insight into the underlying mechanisms driving the disease process. Features uncovered by systems-biology approaches can then be explored using more conclusive analyses.
Our networking approach indicated connections between odor and the biogenic amines putrescine and cadaverine. Putrescine and cadaverine are both formed in fish and shellfish by bacteria during decomposition  and have been strongly correlated with the characteristic odor exhibited by spoiled fish , . These matched the profiles of the characteristic odor of BV, described as having an amine or “fishy” smell . Coordinated with increases in these metabolites, we observed decreases in volatile compounds reported to exhibit more pleasant odors, such as 2(5H)-furanone  and 2-ethyl-4-methyl-1,3-dioxolane . Putrescine and cadaverine were not universally increased in SBVII metabolome profiles; however, odor was only clinically ascribed to one of these subjects, compared to all SBVI subjects. In this SBVII individual, levels of putrescine and cadaverine were higher than in any asymptomatic individual. The lack of correlation between odor and SBVII was an interesting observation and may partially explain the linkage of douching to this group. Two recent papers have described the ability of douching to reduce or eliminate vaginal odor , . Douching was not inversely correlated with either putrescine or cadaverine; however, such a correlation would only be expected if these metabolites were found more universally in the absence of douching (i.e., if not douching would equally lead to an increase in these metabolites). Although douching had a significant affect on the compositions of the microbial community and the metabolome, it did not alone explain the delineation between metabotypes I and II.
We observed links between 2-methyl-2-hydroxybutanoic acid and discharge, as well as diethylene glycol and pain. The nature of these linkages is less clear, but it is worth noting that interpretation of metabolomic data can be difficult. Being only privy to metabolites above an appreciable level (typically metabolic end-products or those produced faster than they are utilized) means pathway intermediates, and consequently the pathways leading to these metabolites are not "observed". This makes a strong case for combining metabolomics approaches with metagenomics and metatranscriptomics.
We identified links between four metabolites that appeared important to symptomology (putrescine, cadaverine, 2-methyl-2-hydroxybutanoic acid, and diethylene glycol) and several microbial genera. For example, Dialister spp., detected in all SBVI samples, were linked to increased levels of both putrescine and cadaverine and Gardnerella spp. were linked to increased diethelene glycol levels, while Mobiluncus spp. were linked to increases in 2-methyl-2-hydroxybutanoic acid, supporting roles for these bacteria in defining the BV state.
With the exception of douching, most factors relating to subject behavior were many steps removed from BV-positive Amsel criteria or from a high Nugent score within the resulting positive association network (Table 5). We therefore hypothesize that, while personal habits and sexual practices may be influential, or even be essential precursors, additional factors contribute to the onset of BV. Marriage had a strong negative correlation with Amsel-determined BV-positive women, but we did not observe similar associations with promiscuity or sexual frequency, suggesting changes in these behaviors did not account for this observation. Many speculations could be posited for this, but we hypothesize it relates to findings that couples share more similar microbial types , such that intercourse between stable sexual partners results in less perturbation of the vaginal microbiome.
Overall we have shown clear delineations between the vaginal metabolomic profiles of women displaying BV-positive Amsel criteria and those from asymptomatic women. Our results hint at the possibility of two different paths to the symptomatic BV state, with each sharing a common disruption of host epithelial cell integrity, but differing in some symptoms, implicative microbial taxa and their detectability by the Nugent method (Fig. 8). Based on these observations and an accumulation of evidence over the years, it is becoming increasingly clear that the break down of a typically lactobacilli-dominated vaginal microbiome only increases susceptibility to BV. The onset of disease symptoms requires the presence or absence of certain metabolites that may be positively or negatively affected by certain microbial metabolisms. Furthermore, symptoms of BV (discharge, odor, itching and pain) may not have a common origin, but instead may each be related to separate metabolic processes elicited by the presence of Dialister spp., Gardnerella spp., Mobiluncus spp., or other bacteria. Findings that horizontal gene transfer is extensive within the vaginal microbiome  and that such genetic exchange has led to dramatic disparities in the virulence potentials of Gardnerella vaginalis strains highlight the murky relationship between BV and specific microbial taxa. Metabolomic profiling may offer a fast and cost-effective alternative to other methods enabling definitive diagnoses.
The Institutional Review Boards of the University of Illinois at Urbana-Champaign (#05079, obtained 10-31-11) and the Carle Foundation Hospital (#89689-4, obtained 06-17-11) approved this study. Informed written consent was obtained from all study participants prior to sample collection.
Ectocervical lavage (LA) samples were collected using a 15-ml sterile saline solution (Health Care Logistics, Circleville, Ohio) from 36 individuals, as previously described . Patients were evaluated for BV via the Amsel criteria (vaginal discharge, pH>4.5, odor, clue cells;  by a trained clinician with >20 years experience prior to sample collection. Samples were frozen immediately upon collection and stored at −80°C for up to three years until used. Vaginal smears were also obtained for each subject at the time of clinical examination by rolling a swab across and along the length of the vaginal wall and then onto a glass slide, which was stored at −80°C until Gram stained for independent evaluation by a second trained professional, also with >20 years experience, according to the Nugent criteria . A Nugent score of 7–10 was classified as having a high Nugent score and is often described as having a positive bacterial vaginosis (BV) state, irrespective of disease symptoms.
Genomic DNA was isolated from 0.5-ml aliquots of each collected lavage sample. To each aliquot, 125 µl of 0.5 M Na.EDTA (pH 8), containing 75 mg/ml of lysozyme was added and samples were lysed at 37°C in for 30 min. To each sample, 70 µl of sodium-dodecyl-sulfate was added and protein was degraded by incubation at 55°C in the presence of proteinase K (0.05 mg/ml, final concentration) for 30 minutes. Further lysis was facilitated by three cycles of rapid freeze-thaw, performed by immersion in a dry-ice/ethanol slurry until frozen, followed by heating to 37°C. Protein was precipitated by the addition of 70 µl of NaCl and a 30-minute incubation on ice. Precipitated protein was removed by centrifugation and residual organic matter was removed by phenol:chloroform washes, followed by ethanol precipitation to remove residual salts.
The V1-V3 region of the 16S ribosomal RNA gene was amplified from the resulting DNA by polymerase chain reaction (25 cycles of 94°C (30 s), 48°C (30 s), 72°C (2 min)), using primers 27f-YM (CGTATCGCCTCCCTCGCGCCATCAG-AGAGTTTGATYMTGGCTCAG; and 534r (CTATGCGCCTTGCCAGCCCGCTCAG-(MID tag 1–50)-ATTACCGCGGCTGCTGGCA). The amplicons were pyrosequenced using 454 FLX-Titanium technology at the J. Craig Venter Institute (Rockville, MD). The resulting sequences were quality trimmed using the FastX toolkit (available: http://hannonlab.cshl.edu/fastx_toolkit/). Remaining sequences shorter than 250 nucleotides, with homopolymers longer than 6 nucleotides, or containing ambiguous base calls were removed. Sequences were aligned against the silva database . Potentially chimeric sequences were detected using UCHIME  (<6% sequences detected as chimeric) and removed. The remaining reads were pre-clustered as previously described  and then clustered using ModalClust (https://bitbucket.org/msipos/modalclust). OTUs were defined as sharing ≥97% sequence complete-linkage identity with the most abundant sequence forming the OTU seed. OTUs detected in less than three samples and fewer than three times were removed as possible artifacts.
Lactobacillus iners was detected by PCR using primers 453F (ACAGGGGTAGTAACTGACCTTTG) and 1022R (ATCTAATCTCTTAGACTGGCTATG)  and conditions 95C for 10 minutes, followed by 30 cycles of 95C for 30 seconds, 65C for 30 seconds, and 72C for 30 seconds. A final extension was performed at 72C for 7 minutes. Samples positive for L. iners were then enumerated by quantitative PCR using the same primers with conditions 50C for 2 minutes, 95C for 10 minutes, followed by 40 cycles of 95C for 30 seconds, 65C for 30 seconds, and 72C for 30 seconds. Upon completion, a melting curve analysis was performed to ensure purity of the amplification product. All products showed the same overlapping peak indicating the specificity of the primers.
The raw sequences were deposited in the Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra) with accession numbers SRR639185.
Two 1-ml fractions of the lavage samples were taken from each subject and dried. One fraction was derivatized, according to  with minor modifications: 90 minutes at 500°C with 80 μl of methoxyamine hydrochloride in pyridine (20 mg/ml), followed by a 60-minute treatment at 500°C with 80 μl mass spec grade trifluoroacetic acid. A 5-μl aliquot of an internal standard (C31 fatty acid) was added to each sample, in the derivatized samples this occurred prior to trimethylsilylation. Sample volumes of 1 mL were injected with a split ratio of 71 into a GC-MS system, consisting of an Agilent 7890A (Agilent Inc, Palo Alto, CA, USA) gas chromatograph, an Agilent 5975C mass selective detector and an Agilent 7683B autosampler. Gas chromatography was performed on 60-m HP-5MS columns with 0.25 mm inner diameter and 0.25 mm film thickness (Agilent Inc, Palo Alto, CA, USA) and with an injection temperature of 2500°C, the interface set to 2500°C, and the ion source adjusted to 2300°C. The helium carrier gas was set at a constant flow rate of 1.5 ml min−1. The temperature program of 5-min isothermal heating at 700°C, followed by an oven temperature increase of 50°C min−1 to 3100°C and a final 20 min at 3100°C. The mass spectrometer was operated in positive electron impact mode (EI) at 69.9 eV ionization energy in m/z 30–800 scan range. The spectra of all chromatogram peaks were then compared with electron impact mass spectrum libraries NIST08 (NIST, MD, USA), WILEY08 (Palisade Corporation, NY, USA), and to a custom library of the University of Illinois Carver Metabolomics Center. To allow direct comparisons between samples, all data was normalized to the internal standard in each chromatogram. The chromatograms and mass spectra were evaluated using the MSD ChemStation (Agilent, Palo Alto, CA, USA) and AMDIS (NIST, Gaithersburg, MD, USA). The retention time and mass spectra were implemented within the AMDIS method formats. Metabolites were determined to be significantly enriched in one or other state if after normalization, their median concentration in that state was >3-fold increased over the alternative state and they achieved a p-value of p<0.05.
Taxonomic profiles from phyla to genus were generated for all reads using the RDPclassifier v2.4 , with a cutoff of 0.7. Species identities were determined using speciateIT (as described in ) and these designations were used to evaluate sample biotypes (I–V), as defined by . L. iners qPCR results were added to speciateIT results based on their proportion of the total bacteria present in the vagina using prior enumerations of the total number of bacteria present in the human vagina (~4×108) . Heatmaps were produced using the gplots package in R (Available: http://cran.r-project.org/web/packages/gplots/index.html), and dendrograms were constructed from Euclidean dissimilarity distances.
The relationships among the samples were compared using Bray-Curtis dissimilarity statistics, following normalization of the data to their total read depth (i.e., the proportional representation of each taxonomic group, or metabolite) and transformation of this data by square-root to reduce the influence of higher abundant over less abundant features. nMDS plots were constructed using this data and visualized in Primer (Primer-E 2007).
For parametric analysis of patient metadata on hygiene and sexual practices with data relating to taxonomic abundances, including L. iners qPCR-determined abundances, and metabolite abundances, pairwise Pearson's product moment correlation coefficients were determined. For non-parametric patient metadata, including patients demographics and symptoms, pairwise Spearman's correlation coefficients were determined. Pearson's and Spearman's correlations were obtained using the ‘cor’ function in R. A set of 1000 random variables, each with 36 random values between 0 and 100 were added to the dataset prior to forming correlation matrixes. The correlations formed between these random variables and our dataset were used to evaluate the appropriate correlation thresholds (Fig. S3 and Fig. S4). Correlations >0.6 or <−0.4 were determined to be significantly discriminative and were mapped into a network by using cytoscape , after transformation into the appropriate format using custom perl scripts. Distances between patient variables were determined using Shortest path (Available at: http://www.cgl.ucsf.edu/Research/cytoscape/shortestPath/index.html).
Lactobacillus iners-specific PCR. Gel image shows the resulting products from a Lactobacillus iners-specific PCR. Sample positions are shown above the loading lanes.
Overall Network. The overall network of Pearson's (between parametric data) and Spearman's (between non-parametric and either parametric or non-parametric data) correlations >0.6 (green) or <−0.4 (red) are shown as edges connecting patient metadata relating to demographics, hygiene and sexual behaviors and sexual practices, OTUs, microbial genera, metabolites and patient symptoms. Sub-networks of this network are shown in text and in Fig S4.
Determination of Positive Correlation Thresholds. Plot shows the proportion of potential connections realized as the correlation threshold increases for our dataset and compared to a set of 1000 random variables, each with 36 random values between 0 and 100. Using this approach we determined that positive significance thresholds of ≥0.6 for either Pearson's (parametric) or Spearman's (non-parametric) correlation coefficients were most informative of significant positive correlations.
Determination of Negative Correlation Thresholds. Plot shows the proportion of potential connections realized as the correlation threshold increases for our dataset and compared to a set of 1000 random variables, each with 36 random values between 0 and 100. Using this approach we determined that negative correlations of ≤−0.4 for either Pearson's (parametric) or Spearman's (non-parametric) correlation coefficients were most informative of significant correlations.
Relationships between Nugent score, Nugent-determined BV and BV Symptomology. Sub-network showing the two paths from Nugent score and Nuget-determined BV (Nugent score≥7) and symptomatic BV as determined by Amsel criteria. Positive connections are shown as green edges, negative connections are shown as red edges.
Microbial taxa and their relative 16S rRNA gene representation.
Metabolites and their relative concentrations.
The authors wish to acknowledge the clinical, diagnostic and clerical staff at Carle Foundation Hospital and Christie Clinic for provision of samples.
This work was funded by the National Institutes of Health (NIH) grant R56 AI085096-01 (www.NIH.gov; PI: B. White), a Carle-UIUC translational research initiative (CUTR) grant (PI: B. Wilson), and a National Science Foundation (NSF) Grant #08-20709 (www.nsf.gov; PI: RS), with additional support from the Institute for Genomic Biology and the Research Board of the University of Illinois at Urbana-Champaign. Parts of this work were also funded by the Mayo Clinic-University of Illinois alliance. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.