With a robust sampling, sequencing, and analysis approach, we generated the first comprehensive catalogue of the vaginal microbiome in pregnancy across subsite and gestational age. When compared to the non-pregnant vaginal microbiota, the community is uniquely and distinctly structured during pregnancy ( and ), in ways that cannot be attributed to alterations in BMI (Tables S1), to subject race or ethnicity, or to readily identifiable clinical confounders. Interrogation of discrete contributors to community diversity revealed that the vaginal microbial community varied in pregnancy by gestational age and proximity to the cervix, but was less diverse and less rich overall ( and ). To our knowledge, this structured molecular study of gravidae is unique in the stringency of its parallel clinical approach, its sample acquisition from subjects, the depth and robustness of its analysis, and its notable findings. In sum, the vaginal microbiome is distinctly structured by a state of health in most women’s lifetime, i.e., pregnancy.
Others have taken similar but limited approaches to interrogating the vaginal microbiota in pregnancy. Dominguez-Bello et al. used 16S molecular signatures generated by 454 sequencing to produce vaginal profiles in a limited sample set of 9 subjects at term delivery (including non-laboring and actively laboring mothers) from a remote population of Amerindians. In this small sample set, the dominant vaginal taxa varied from mother to mother, with notable variance in Lactobacillus spp. That investigation did not include parallel sampling of both non-pregnant and pregnant subjects, nor sampling from multiple vaginal subsites. It was, however, remarkable for its parallel acquisition of neonatal microbial community samples. As supported by other studies, the infant gut microbiome largely reflects the maternal mode of delivery. It bears mention, however, that in several studies women were delivered by cesarean for obstetrical indications in active and advanced labor, which reveals a potential bias: the differences may arise from infant handling in cesarean versus vaginal birth and not solely from fetal descent through the birth canal.
Our study suggests that although human adults have highly differentiated bacterial communities that are relatively stable, in such prevalent and healthful states as pregnancy the vaginal community in particular shifts naturally in its structure with respect to diversity and richness. Indeed, Ravel et al. have previously reported that the vaginal microbiome in healthy, reproductive-aged women occupies states dominated by Lactobacillus iners and Lactobacillus crispatus, specifically in association with low vaginal pH. Others have speculated that the vaginal community changes with pregnancy, but ours is the first direct evidence of this. Our findings suggest that, at least among reproductive-aged women, the vaginal microbiome remains a dynamic community in adult reproductive life, and that terminal differentiation does not occur per se. Moreover, we observe persistent relative prevalence (but neither sole nor absolute predominance) of Lactobacillus (, AbundantOTU). However, across the entirety of our study population, diversity and richness were lower, with measured variance across weeks of gestation and with proximity to the uterus (posterior fornix), leading us to speculate on variances at the cusp of preterm viability. Of interest, in subjects closer to term, OTU-based projections suggest that the non-pregnant community structure may return to some extent in the latter weeks of gestation. Our study is potentially limited by its use of a cross-sectional comparison in gravidae () relative to a limited number of non-pregnant subjects with multiple samplings (18/301 specimens represented thrice-repeated sampling; see Methods). Nevertheless, when we compared only first samplings of non-pregnant women to gravidae, we still observed consistent cluster separation (data not shown). The most robust method to formally address the structure shifts in pregnancy would be a longitudinal approach in which each subject is sampled at weekly intervals across pregnancy. That, however, is outside the scope of this initial study.
We opted to employ a sampling strategy with stringent inclusion and exclusion criteria parallel to those of the Human Microbiome Project (HMP). While this enabled us to make true comparisons to a large, robust, and unparalleled dataset of non-pregnant subjects, it also opened the possibility that we were sampling an unperturbed but not “normal” population. However, it bears mention that our outcomes among gravidae were entirely what might be anticipated in a healthy pregnant population (), and did not differ significantly among subjects. As with all large human cohorts, our study is prone to both alpha (type I) error and induced bias. We attempted to minimize error and bias by having a single physician perform all subject sampling in both cohorts, and by extracting all samples from primary specimens within a single laboratory using a common and rigorously tested protocol (HMP).
It remains a distinct possibility that the significant community clustering we observed, with the vaginal microbiome evidently structured by pregnancy, reflects a secondary trait of our pregnant population rather than the gravid condition itself. Of note, we did not exclude gravid subjects by virtue of posterior fornix vaginal pH. In contrast, in the non-gravid HMP cohort, subjects were excluded at the time of screening if the posterior fornix pH exceeded 4.5 (see Methods). Given that <10% of screen failures (and <4% of the entire potential cohort) met such pH criteria for exclusion, we feel that this is an exceedingly unlikely confounder or source of bias. However, it cannot be formally excluded. To this end, in the recent publication of Ravel et al., the authors reported that the pH and Nugent scores of each community demonstrated a strong correlation between high pH and high Nugent scores, and that the highest pH values were associated with community states not dominated by species of Lactobacillus. It is important to note that these investigators employed self-sampling in their study. The investigators also reported that elevated pH and high Nugent scores were observed in some communities with high proportions of Lactobacillus species, and that this was most true in communities that contained decreasing proportions of L. iners. However, this was not universally true, leading the authors to surmise that these metrics cannot be predicted with absolute certainty solely on the basis of the proportion of Lactobacillus in a community. We concur with these investigators’ summary statements, and note that when comparing our population of pregnant subjects to the non-pregnant HMP cohort we observed community discrimination by virtue of Lactobacillus species, namely L. iners, L. crispatus, L. jensenii and L. johnsonii. If our community distinctions were the result of an incidental inclusion of pregnant subjects with a posterior fornix pH >4.5, then we would anticipate a decreasing proportion of L. iners in the community structure. However, the opposite holds true (), making confounding by virtue of pH unlikely.
As the number and robustness of computational approaches to the analysis of metagenomic data increase, investigators are faced with distinct methodologic approaches to analyzing community profiles. In any emerging field of study, optimal measures of data analysis are not evident to investigators at the forefront, and different methodologic approaches may yield variance in the significance of findings. With this in mind, we employed a diverse and robust set of bioinformatic tools in the analysis of our datasets (see Methods). For community cluster distinction (beta metrics), we analyzed taxa by nonphylogenetic and phylogenetic methods. Regardless of distance metric or phylogenetic analysis, the vaginal microbiome distinctly clustered by virtue of pregnancy (, , and ). AbundantOTU uses a consensus alignment algorithm and thus tends to concentrate on OTUs of greater abundance. Detection of rare species is difficult to differentiate from sequencing error. The QIIME denoiser preempts this difficulty with a pre-filter that reduces the need for all-against-all comparison: each additional unclustered read is compared to the most abundant clusters to distinguish probable sequencing error from detection of rare bacterial species, which are retained in the OTU table. Regardless of methodology, our results from AbundantOTU and QIIME denoising are strikingly similar in terms of the differential OTUs identified (). This finding is further evident at the species level, as detailed in .
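To make concrete what a nonphylogenetic beta metric measures, the following is a minimal sketch of the Bray-Curtis dissimilarity between two samples, computed on relative abundances. It is an illustration only: the choice of Bray-Curtis is an assumption (the specific distance metrics used are described in the Methods, not here), and the OTU counts and the names `relative_abundance` and `bray_curtis` are hypothetical.

```python
def relative_abundance(counts):
    """Convert raw OTU read counts to proportions that sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples (0 = identical, 1 = no shared OTUs).

    Both samples must list the same OTUs in the same order.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must cover the same OTUs")
    pa = relative_abundance(counts_a)
    pb = relative_abundance(counts_b)
    numerator = sum(abs(x - y) for x, y in zip(pa, pb))
    denominator = sum(x + y for x, y in zip(pa, pb))
    return numerator / denominator


# Hypothetical OTU counts (illustrative values, not study data).
sample_1 = [90, 5, 5, 0]     # strongly dominated by one OTU
sample_2 = [80, 10, 10, 0]   # similar dominance pattern
sample_3 = [40, 20, 20, 20]  # more even community

print(bray_curtis(sample_1, sample_2))
print(bray_curtis(sample_1, sample_3))
```

In this toy input the two dominated samples are closer to each other than either is to the more even sample, which is the kind of separation that ordination and clustering on a distance matrix would then display. Phylogenetic metrics additionally weight differences by the evolutionary relatedness of the taxa, which this sketch does not attempt.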
Similarly, for measurement of within-community (alpha) diversity we employed variations in data filtering, ranging from minimal removal with scant trimming of reads to well-described modest “denoising” and chimera removal. We persistently observed less richness and diversity in pregnant communities when compared with parallel non-pregnant subject cohorts, regardless of computational pipeline, tool employed, or means of data projection ( and ). The limited results of analyses at the genus level, and the number of significantly differential OTUs (), both suggest that it is subgenus taxa that most strongly contribute to the observed alterations in community structure in pregnancy. This was supported by two complementary, rigorous denoising approaches to our dataset: AbundantOTU and QIIME. The former resulted in diminished OTU estimates (868 versus 1,121 OTUs), but both agreed on the predominance of lactobacilli () irrespective of gravid condition and on the clades that were differential during pregnancy (). With QIIME denoising, a broader set of taxon differences and less absolute predominance of Lactobacillus could be observed. While each denoising pipeline has its own strengths and limitations, one undeniable observation persists: the vaginal microbiome community is structured by pregnancy and varies with respect to richness, diversity, and specific microbial members.
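For readers less familiar with the alpha metrics discussed here, the sketch below computes richness (the number of observed OTUs) and the Shannon diversity index for a single sample. It is a minimal illustration under stated assumptions: the Shannon index is used as one common diversity measure and is not necessarily the estimator used in this study, and the counts and function names are hypothetical.

```python
import math


def richness(counts):
    """Observed richness: the number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon diversity index (natural log) from raw OTU read counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    # OTUs with zero reads contribute nothing to the index.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical OTU counts (illustrative values, not study data).
dominated = [98, 1, 1, 0]   # one OTU holds nearly all reads
even = [25, 25, 25, 25]     # reads spread equally over four OTUs

print(richness(dominated), shannon(dominated))
print(richness(even), shannon(even))
```

A community dominated by a single taxon scores lower on both measures than an even one, which is the pattern described above for pregnant relative to non-pregnant communities. Richness depends strongly on sequencing depth and on how rare OTUs are filtered, which is one reason different denoising pipelines yield different OTU totals.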
Employing such robust analysis methods, we were able to detect species that are discriminately and specifically relatively enriched in pregnancy (albeit in the face of overall diminished community richness and diversity). These include Lactobacillus iners, Lactobacillus crispatus, Lactobacillus jensenii and Lactobacillus johnsonii. Although it is outside the scope of this initial manuscript to delve deeply into species differentiation and its clinical implications, these findings are nevertheless of probable biologic significance. For example, L. johnsonii encodes enzymes and transporters essential for the release of bile salt hydrolase and is primarily found in the upper GI tract. In addition, the capacity for production of bacteriocins is a broad trait of the lactic acid bacteria, and production of lactacin F by L. johnsonii limits both other Lactobacillus and Enterococcus species in the GI tract. Its notable increased dominance in the vagina in pregnancy may be important for establishing the neonatal upper GI microbiota upon delivery, or for preserving the integrity of the community to reduce the risk of ascending infection or preterm birth.
The vaginal microbiome signature in pregnancy is thus distinct from that of the non-pregnant state, and this distinction arises both from lesser diversity and, to a lesser degree, from the absence and occasional presence of unique taxa. Our reporting by gestational age and vaginal subsite lays the foundation for further interrogation of microbial variance, including in such presumed pathogen-related perinatal morbidities as preterm birth. Moreover, it adds to the growing understanding of the remarkably dynamic nature of our metagenome and its role in vertical transmission of the microbiota through subsequent generations.