DNA from rumen solid or liquid material was extracted for PCR amplification with primers specific for prokaryotic 16S rRNA and fungal 18S rRNA genes. Sequences were trimmed for quality using LUCY, which has been shown to help reduce the overestimation of OTUs that commonly results from 454 pyrosequencing errors. All primer-trimmed sequences that passed the length cut-off were analyzed with MOTHUR, with a species-level OTU definition of 97% sequence identity (i.e., 3% divergence).
Assessment of Current Publicly Available Ruminal SSU rRNA Sequences
The NCBI, SILVA and RDP repositories of bacterial 16S sequences were queried to retrieve nucleotide sequences annotated as ruminal (see Materials and Methods for search terms) so that the new data could be compared to existing data. The queries returned 22485, 12153 and 15637 sequences, respectively (Table S1). The sequences from all three public repositories were combined to form a reference dataset (indicated as “REF” in and Table S1).
These reference sequences were then aligned against a reference 16S sequence alignment. Based on this alignment (Figure S1), the V1–V3 region of the 16S sequence seemed to be slightly overrepresented in the public repositories. In general, the sequences deposited in the three repositories spanned the length of the entire 16S sequence. Overall, the public repositories contained 14332 unique sequences aligning to the V1–V3 region of the 16S gene (Figure S1). These previously discovered sequences were compared to the sequences generated in this study. When the publicly available sequences were clustered into OTUs based on sequence similarity, the public repositories contained approximately 4670 distinct, ruminal, species-level bacterial OTUs, slightly fewer than previously reported from the public domain. This difference may be due to the way the reads were processed. In this study, only those reads mapping to the V1–V3 region were clustered, while Kim et al. did not single out a specific region, which leaves the potential for reads originating from the same species, but mapping to different regions of the SSU rRNA gene, to be clustered into different OTUs due to a lack of sequence overlap. In addition, the Kim et al. dataset was based on a multiple sequence alignment against the Greengenes database, while this study used CD-HIT to determine sequence identity.
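The idea of grouping sequences into species-level OTUs at 97% identity can be illustrated with a toy greedy clustering scheme. This is a minimal sketch only, not the CD-HIT or MOTHUR implementation: the function names `identity` and `greedy_cluster` are hypothetical, and the sketch assumes the sequences are already aligned to the same length, so identity is simply the fraction of matching positions.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def greedy_cluster(seqs: list[str], threshold: float = 0.97) -> list[list[str]]:
    """Greedy clustering: each sequence joins the first cluster whose
    representative (the first member) it matches at >= threshold identity;
    otherwise it founds a new cluster."""
    clusters: list[list[str]] = []
    for seq in seqs:
        for cluster in clusters:
            if identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            # No existing representative was similar enough.
            clusters.append([seq])
    return clusters
```

Because identity can only be computed over overlapping positions, reads from the same organism that cover different regions of the gene cannot be compared this way, which is the concern raised above about clustering without restricting to one region.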
Diversity among 12 cows and public repositories.
Ruminal archaeal and eukaryotic SSU rRNA sequences were retrieved from public repositories using keyword searches similar to those used for bacteria (above). A total of 4198 ruminal archaeal SSU rRNA sequences in NCBI, 1120 in SILVA and 3703 in RDP were retrieved. A reference dataset, combining the sequences from the three repositories (“REF” in ), contained 2484 unique sequences that aligned to the V1–V3 regions (Figure S2). Subsequent analysis of the 2484 sequences clustered at 97% identity detected 486 ruminal OTUs in the three repositories. A search for eukaryotic 18S sequences of the rumen in SILVA and NCBI databases yielded 1027 and 1803 sequences, respectively. Only approximately 10% of the eukaryotic sequences retrieved were annotated as fungal. RDP was not queried for eukaryotic sequences, as it does not house 18S rRNA sequences. Based on the alignment to the 18S reference sequence, a region matching the region sequenced in this study (Figure S3) yielded 168 OTUs.
Detailed results of the overlap of OTUs per repository for SSU rRNAs are presented in Table S1 and Figure S4. For each microbial domain, NCBI consistently had the greatest number of unique OTUs but was not all-inclusive, which justified combining all three databases. Representative bacterial, archaeal, and eukaryotic OTU sequences from each repository were pooled to generate three reference datasets, which were subsequently used as a benchmark for diversity found in the current study.
Current Study Versus Public Repositories
To determine whether maximal microbial species-level OTU richness within the rumen has already been captured, SSU rRNA was sequenced from rumen contents of 12 individual animals and compared to the same region (V1–V3) of the public repository OTU representative reads. Because sequencing depth varied among the samples in this study, a random set of 5520 bacterial, 82 archaeal and 1046 fungal sequences was chosen based on the sample with the fewest sequence reads before entering the QC pipeline. Random subsampling of libraries to equalize the number of reads per sample has been suggested as one solution to obtaining unbiased non-parametric richness estimates of microbial community diversity from NextGen sequencing data. Upon completion of the QC pipeline (see methods), a total of 23493, 138, and 2089 unique bacterial, archaeal, and fungal sequences were identified, respectively. The reads were subsequently clustered into 4367, 20, and 52 species-level OTUs, respectively (). Good’s coverage, a measure of coverage of dominant taxa (i.e., those OTUs with more than one sequence), was 67% for Bacteria and higher for Archaea (98%) and Fungi (100%), compared to 81–85% for the public repositories (). This disparity in coverage may be explained in part by the fact that the public repository sequences originated from a diverse group of ruminant animals (yellow cattle, reindeer, cattle, etc.) from different geographical locations. Except for Bacteria, species richness and evenness of the public domain dataset were greater than those observed from the animals in this study, as indicated by greater values of all four diversity estimators used.
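The two steps described above, subsampling every library to a common depth and then computing Good’s coverage, can be sketched as follows. This is an illustrative sketch rather than the pipeline used in the study: the helper names `subsample` and `goods_coverage` and the fixed seed are assumptions, and Good’s coverage is taken in its usual form, C = 1 − n1/N, where n1 is the number of OTUs represented by a single sequence and N is the total number of sequences.

```python
import random
from collections import Counter


def subsample(reads: list, depth: int, seed: int = 0) -> list:
    """Draw `depth` reads at random without replacement (seed fixed for repeatability)."""
    if depth > len(reads):
        raise ValueError("depth exceeds the number of available reads")
    rng = random.Random(seed)
    return rng.sample(reads, depth)


def goods_coverage(otu_labels: list) -> float:
    """Good's coverage: 1 - (number of singleton OTUs / total number of sequences).

    `otu_labels` holds one OTU label per sequence.
    """
    if not otu_labels:
        raise ValueError("at least one sequence is required")
    counts = Counter(otu_labels)
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / len(otu_labels)
```

For example, a sample of 10 sequences in which two OTUs are seen exactly once gives a coverage of 1 − 2/10 = 0.8.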
Out of 4367 bacterial species-level OTUs found in this study, only 1262 (29%) were shared with the three public repositories (). Moreover, for every OTU found in the public repositories, a novel one was discovered from the new data, even though Good’s coverage was still far from maximal. Conversely, the majority of the archaeal OTUs (90%) found in this study were previously observed in the public databases, with only 2 new OTUs added: 1 Methanobrevibacter and 1 Thermogymnomonas (). For Fungi, only 1 OTU was shared between the public repositories and the animals investigated in this study, suggesting the scientific community is only beginning to realize the extent of fungal diversity of ruminants (). It should be noted that, of the 168 ruminal eukaryotic OTUs in public repositories, roughly 10% were fungal, which further highlights the paucity of rumen fungal rRNA sequences in these databases.
Comparison of rumen SSU microbial sequences to data in public repositories.
Per-animal Bacterial Species Richness
Bacterial 16S rRNA gene sequences from the solid and liquid fractions from each animal were pooled and sampled to generate OTU-based diversity calculations (). As indicated in the “# of usable sequences” column of , almost all animals had at least 5199 LUCY-trimmed, unique, chimera-checked bacterial 16S sequences that were used in the final microbial species richness estimators. The observed number of OTUs ranged from 1903 to 2432 (). The Chao1 non-parametric richness estimator predicted as many as 3116 to 5439 species-level OTUs (). Good’s coverage was between 69–82%, indicating that by using ~5000 bacterial sequences, the dominant bacterial community within the rumen of an individual animal was insufficiently sampled. Taxonomic profiling indicated that Prevotella, Oscillibacter, Coprococcus, unclassified Ruminococcaceae, and Butyrivibrio were the top five most abundant bacterial OTUs present in the rumen, comprising close to 40% of all bacterial taxa observed ().
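For readers unfamiliar with the Chao1 estimator, the sketch below shows one common form of it, the bias-corrected version S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of OTUs observed exactly once and exactly twice. Which variant the analysis software applied is not stated here, so the choice of the bias-corrected form and the function name `chao1` are assumptions.

```python
def chao1(otu_counts: list[int]) -> float:
    """Bias-corrected Chao1 richness estimate from per-OTU sequence counts."""
    observed = [c for c in otu_counts if c > 0]
    s_obs = len(observed)                     # number of OTUs actually seen
    f1 = sum(1 for c in observed if c == 1)   # singletons
    f2 = sum(1 for c in observed if c == 2)   # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
```

With counts `[1, 1, 1, 2, 2, 5]` there are 6 observed OTUs, 3 singletons and 2 doubletons, so the estimate is 6 + (3 × 2) / (2 × 3) = 7. Many singletons relative to doubletons push the estimate well above the observed richness, which is the pattern reported for the bacterial data.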
Sampling microbial diversity across 12 cows.
Bovine rumen microbial diversity among 12 cows.
Per-animal Archaeal Species Richness
An analysis of archaeal diversity was performed and summarized in , by clustering the sequences generated from PCR of the 16S rRNA gene using archaeal-specific primers A109F and A934R. Between 8 and 13 species-level OTUs were found per animal at an indicated coverage of 89–96%. At the genus level, these OTUs represented a total of 5 archaeal genera, with the majority of OTUs being composed of Methanobrevibacter
species were observed in 4 of the 12 animals ().
Per-animal Fungal Species Richness
An OTU analysis analogous to those performed on bacterial and archaeal 16S rRNA gene sequences was performed on fungal 18S rRNA gene sequence reads and summarized in . Briefly, only between 21 and 40 OTUs were identified despite using about 1,000 sequences per animal. The Good’s coverage estimates of 98.4–99.9% indicated that nearly the full extent of fungal diversity detectable with primers EF4a and fung5a in the rumen of the 12 animals was captured. In the top 10 genera (), 5 were potentially novel, marked as unclassified at some taxonomic level. Among the known fungal genera, Nectria
were the most abundant, comprising over 25% of the 46 fungal genera detected ().
Rumen Solid Versus Liquid Phase Species Richness
Samples of ruminal content from all 12 animals were separated into solid and liquid fractions (Text S1). To determine whether observed differences in bacterial OTUs between liquid and solid fractions, as measured by taxonomic profiling and PCA, were statistically significant, the non-parametric Wilcoxon test, corrected for multiple hypothesis testing, was applied. Bacterial biodiversity of seven genera differed significantly (P<0.05) between liquid and solid fractions of the rumen contents (, Table S2), while there was no statistical difference observed between the liquid and solid fractions in biodiversity of Archaea
(), and Fungi
(both members of the order Bacteroidales
) were overrepresented in the liquid fraction of the rumen ()
. Conversely, Butyrivibrio
(both members of the order Clostridiales
) were significantly overrepresented in the solid fraction of the rumen. These results are consistent with previous observations that Prevotella are more prevalent in the liquid fraction of pasture-fed cows and bermudagrass hay- or wheat-fed steers. Likewise, Butyrivibrio, a member of the family Lachnospiraceae, was also shown to be more abundant in the solid fraction of pasture-fed cows and bermudagrass hay- or wheat-fed steers. However, the Tannerella
) results vary more across these studies. This may be due to differences in geographical location, diet, time of sampling post feeding, and the genetic background or sex of the animals.
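When many genera are tested at once, raw P values need adjusting so that the expected share of false positives stays controlled. The specific correction applied to the Wilcoxon results is not recoverable from the text above, so the sketch below swaps in the widely used Benjamini-Hochberg step-up procedure purely as an illustration; the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A genus would then be reported as differing between fractions only if its adjusted value falls below the chosen threshold (0.05 in this section).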
Comparison of microbial diversity in bovine rumen solid (S) and liquid (L) fractions in 12 cows.
Cross-domain Analysis of the Microbiome
Little is known regarding cross-domain interactions among the inhabitants of the bovine rumen. To address this void, a comprehensive analysis of the patterns of abundance of Bacteria
were determined across the 12 animals. Based on comparison of the relative abundance of OTUs in different domains, especially in the case of Bacteria
whose abundances differed by several orders of magnitude, a log transformation was applied to the raw OTU counts. These log-transformed abundances observed across 12 cows were then hierarchically clustered based on distance calculated as 1– |r| (where r is the linear correlation coefficient) of any combination of two microbial taxa and represented as a heatmap ().
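The distance used for the hierarchical clustering can be sketched in a few lines. This is a minimal illustration under stated assumptions: the pseudocount of 1 (to handle zero counts) and the base-10 logarithm are not specified in the text, and the function names are hypothetical. The distance itself, 1 − |r| with r the Pearson correlation of two taxa's log-transformed abundances across animals, follows the description above.

```python
import math


def log_transform(counts: list[float], pseudocount: float = 1.0) -> list[float]:
    """Log10-transform raw counts; the pseudocount (an assumption) avoids log(0)."""
    return [math.log10(c + pseudocount) for c in counts]


def pearson_r(x: list[float], y: list[float]) -> float:
    """Linear (Pearson) correlation coefficient of two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def correlation_distance(x: list[float], y: list[float]) -> float:
    """Distance 1 - |r|: strongly correlated or anti-correlated taxa are close."""
    return 1.0 - abs(pearson_r(x, y))
```

Taking the absolute value means that two taxa whose abundances move in exactly opposite directions across the animals are treated as just as closely related as two that rise and fall together.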
To determine the statistical significance of observed correlations, fdrtool was used to calculate false discovery rate (FDR)-corrected q-values for each correlation coefficient. In all, 74691 pairwise combinations of genera were analyzed, producing 1424 significant (q < 0.05) correlations (Table S3), too many to interpret manually. At the taxonomic level of class, however, 1275 possible correlations were calculated and 10 were significant (q < 0.05) (Table S4, and noted with asterisks and boldface in ). Notably, abundance of an unclassified fungal class of subphylum Pezizomycotina
was inversely correlated with Caldilineae
Subdivision 5 Bacteria (RDP taxonomy from MOTHUR). Subdivision 5 is a class of uncultured Verrucomicrobia that was first identified from a hydrocarbon-contaminated aquifer. Abundance of the fungal class Tremellomycetes was positively correlated with abundance of the bacterial class Verrucomicrobiae and negatively correlated with abundance of the bacterial class Gemmatimonadetes. On the other hand, abundance of one member of Pezizomycetes
(fungal class) was positively correlated with abundance of Halobacteria
(two members of the Archaea
). Both Halobacteria
were observed in only one animal (C9, ) as Halogeometricum
, respectively. Future studies are needed to verify these cross-domain correlations and to provide a biological explanation for them (e.g., which community members are potentially metabolically interchangeable).
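As a check on the scale of the correlation analysis, the numbers of pairwise combinations reported above match the count of unordered pairs among n taxa. The values of n below are inferred from that arithmetic and are not stated in the text, so they should be read as an assumption:

```latex
\binom{n}{2} = \frac{n(n-1)}{2}, \qquad
\binom{387}{2} = \frac{387 \times 386}{2} = 74691, \qquad
\binom{51}{2} = \frac{51 \times 50}{2} = 1275 .
```

Under this reading, the genus-level analysis would correspond to 387 genera across all three domains and the class-level analysis to 51 classes.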
Cross domain OTU comparison based on abundance pattern correlations.
Novel Fungal Taxa
To investigate the striking disparity between the fungal sequences identified in the current study and the sequences currently available in the public repositories (), a phylogenetic tree was inferred, illustrating taxonomic relationships among the representative OTU sequences (). Of the 71 total fungal OTUs identified in this study, only 53 grouped near a previously deposited sequence (gray and black leaves, ). The most abundant OTU identified in this study was represented by over 4620 sequences, was present in all 12 animals, and had sequence similarity to Ascochyta pisi, a known fungal pathogen responsible for blight of common crops such as peas. The second most abundant fungal OTU identified in this study resembled a recently characterized species of Aspergillus (PSBORB-4, Genbank accession HQ393873.1, unpublished). This OTU was represented by 4612 sequences and was also identified in all 12 animals. Aspergillus isolate PSBORB-4 clustered with Aspergillus proliferans; however, based on read counts, PSBORB-4 was over 220 times more abundant. Furthermore, two additional OTUs that were classified as “Ascomycota” and “Aspergillus” clustered with PSBORB-4 and A. proliferans, pointing to potentially novel Aspergillus species that are yet to be discovered.
Phylogenetic diversity of fungal 18S rRNA sequences in bovine rumen.
Close to 40% (black leaves in ) of the 53 BLAST-matched OTUs were only rudimentarily annotated in the public repository (e.g., “uncultured soil fungus”, without any definite taxonomic classification). The remaining 20 OTU-representative sequences did not match any known targets in the recent release of the nt database at NCBI (red and blue leaves in ). For these 20 sequences, least common ancestor (LCA) analysis revealed a possible taxonomic placement for 11 sequences (indicated as blue leaves in ), leaving 9 sequences unclassifiable (red leaves in ) and presumably representing novel taxa of the bovine rumen. Our analysis showed that the current picture of fungal diversity in the rumen is largely incomplete. Specifically, there is greater diversity than previously appreciated among Pleosporales, Neocallimastix, Sordariomyceteideae, Udeniomyces and others.
Sequencing of the SSU rRNA gene of Bacteria and Fungi from bovine rumen suggests that the compositional characterization of the rumen microbiome is incomplete, with several novel fungal taxa being discovered despite targeting the less specific 18S rRNA gene. In contrast, a comparison of archaeal SSU rRNA sequences with sequences from three public repositories yielded only 2 new species-level OTUs. Bacterial community profiles differed between liquid and solid (fiber) fractions, while the archaeal and fungal communities did not differ significantly. Integration of prokaryotic and fungal data sets highlighted cross-domain correlations among the abundances of rumen inhabitants. Future studies should focus on exploring these dependencies further via metagenomic and functional analysis of the bovine rumen.