Some compounds originating from the human gut microbial metabolism of exogenous and endogenous substrates may have properties that profoundly affect the host's physiological processes. The influence of these metabolites on differences in disease risk among individuals could be mediated by metabolism specific to the gut microbial community composition. In this study, we evaluated the effectiveness of terminal restriction fragment polymorphism (TRFLP) as a biomarker of the fecal microbial community (as a surrogate of gut microbiota) for application in human population-based studies. We tested the effects of experimental conditions on DNA quality, DNA quantity, and TRFLP patterns derived from gut bacterial communities. Genomic DNA was extracted from fecal slurries and the bacterial 16S rDNA genes were amplified and analyzed by TRFLP. We found that the composition of the TRFLP fingerprints varied with the extraction procedure. The best quality and quantity of community DNA extracted from fecal material was obtained by using the QIAamp DNA stool minikit (Qiagen, Valencia, CA) with 95°C incubation and moderate bead beating treatment during the cell-lysis step. Homogenization of fecal samples reduced variation among replicates. Once the TRFLP procedure was optimized, we assessed the methodological and inter-individual variation in gut microbial community fingerprints. The methodological variation ranged from 4.5 to 8.1% and inter-individual variation was 50.3% for common peaks. In conclusion, standardized TRFLP is a robust, reproducible, and high-throughput method that will provide a useful biomarker for characterizing gut microbiota in human fecal samples.
Many studies have shown that commensal gut microbes play a significant role in human health (1, 12, 13). The microorganisms in the adult human intestine, which include at least 800 species of bacteria, metabolize compounds that might otherwise be unavailable for human nutrition (6, 9). Bacterial consortia, consisting of numerous species, have the potential to produce bioactive agents from the diet. However, certain conversions are only observed in part of the population suggesting that some people lack the necessary microflora to convert these compounds to chemopreventive molecules (1-3). Thus, inter-individual variation in the gut microbial community may be linked to inter-individual variation in the risk of cancer or other diseases (1, 3, 8).
Gut community fingerprinting techniques, such as terminal restriction fragment polymorphism (TRFLP) analysis (19), potentially offer a rapid overview of inter-individual differences in gut microbial communities. When comparing the TRFLP data generated from different communities, variation can be found in the number and size of peaks and can be evaluated by adapting community parameters such as richness and evenness (7, 21). These data provide quantitative information on the compositional differences of gut microbial communities (25) with the potential to serve as a biomarker in high-throughput population-based studies.
To be useful as a biomarker, TRFLP data need to be highly reproducible and reflect gut microbial community composition. Methodological parameters, such as sampling technique and DNA extraction, have the potential to influence the TRFLP fingerprint of a microbial community (5). Therefore, obtaining microbial genomic DNA that accurately represents the gut microbial community is important (25). When extracting genomic DNA from a complex matrix such as feces, not only is the efficiency of extracting genomic DNA from a wide variety of bacteria a consideration, but removal of contaminants that co-elute with the DNA and may interfere with further molecular analyses is important as well. Several studies have explored different DNA extraction and molecular typing methods for application to human fecal microflora characterization (18, 22, 29, 32, 37); however, few studies have evaluated the TRFLP method for large-scale, human population-based studies (28, 34).
Here, we report a study designed to optimize TRFLP analysis of the fecal microbial community associated with human population-based studies. We evaluated the efficiency of DNA extraction from human feces using two commercial kits. These kits were chosen because previous studies have shown that they lysed fecal bacterial cells efficiently, resulting in a representative genomic community DNA, and they both have been shown to remove environmental organic contaminants which otherwise interfere with downstream molecular analyses (18, 37). In addition, we analyzed the effect of homogenization on variation in DNA extraction and TRFLP fingerprints and the effect of temperature and physical disruption during the cell-lysis procedure on TRFLP fingerprints. Once the TRFLP fingerprint was optimized, we applied the approach to evaluate methodological and inter-individual variation in fecal microbial community fingerprints, which showed the reliability of this biomarker for use in human population-based studies. These data can be used to estimate the sample size needed to characterize the fecal microbiota in human population-based studies.
Three healthy women, aged 32 to 48, donated fecal samples for this study. First, we collected a fecal sample from one woman to explore the best method to extract representative fecal bacterial genomic DNA for TRFLP fingerprinting of the fecal microbiota. To compare the methodological and inter-individual variation as related to TRFLP analysis, we obtained fecal samples from two additional women. All activities were approved by the Institutional Review Board of the Fred Hutchinson Cancer Research Center (IR# 5722), and informed, written consent was obtained from the study participants.
Fresh fecal samples were collected into fecal collection containers (Fisher Scientific, Fair Lawn, NJ). Three grams of feces were suspended and vortexed in 3 ml RNAlater (Ambion, Austin, TX) and 3 ml phosphate buffered saline within 2 hours of defecation and divided into 250 μl aliquots. Alternatively, to test the effect of homogenization on variation in DNA yield and TRFLP, samples were homogenized with an OMNI tissue homogenizer 115 (OMNI Inc., Marietta, GA) at 15,000 rpm or 30,000 rpm before aliquoting. The samples were used fresh or stored at −80°C until further analysis. Before DNA extraction, the thawed aliquots were centrifuged at 16,000 g for 10 min and the supernatant containing RNAlater was discarded.
We examined the effect of different DNA extraction methods, physical disruption, and temperature on DNA quality, DNA quantity, and TRFLP patterns. Two DNA isolation kits, FastDNA SPIN kit for soil (Soil Kit; Qbiogene, Irvine, CA) and QIAamp DNA stool minikit (Stool Kit; Qiagen, Valencia, CA), were compared for lysing bacterial cells and extracting DNA from cell lysates. We used bead beating to determine the influence of physical disruption of bacterial cells on genomic DNA yield and TRFLP patterns (16). Briefly, 0.5 g baked 0.1 mm zirconia / silica beads (Biospec Products, Bartlesville, OK) were added to each aliquot and samples were processed for various times (0.5, 1, 5, 10 min) at speed 5.5 on Fastprep system (Qbiogene). All DNA extractions were done in triplicate following the manufacturer's instructions. For the stool kit, bead beating, if performed, was done after ASL buffer from the kit was added. In addition, we examined the influence of lysis temperature on fecal genomic DNA yield and TRFLP patterns. Fecal samples with ASL buffer were incubated at either 70 °C or 95 °C for cell lysis efficiency comparison (29). All extractions were done in triplicate. DNA yield was quantified by determining absorption of DNA at 260 nm. DNA quality was assessed using gel electrophoresis with a 0.8 % agarose gel (20).
Bacterial 16S rDNA was amplified with primer 8-27f (FAM labeled) (5′-6-FAM-AGA GTT TGA TCM TGG CTC AG-3′) and 1512-1492r (5′-ACG GYT ACC TTG TTA CGA CTT-3′) (Operon, Inc. Huntsville, AL (35, 36)). Each 50 μl reaction mixture contained 1X PCR buffer, 1.5 mM MgCl2, 0.1 mM of each deoxynucleoside triphosphate, 0.2 μM forward and reverse primers, 1 U of Taq polymerase (Fisher Scientific), 0.5 mg/ml bovine serum albumin (New England BioLabs, Ipswich, MA) and approximately 200 ng genomic DNA. Cycling conditions were: 3 min of denaturation at 95°C, 30 cycles of 0.5 min at 95°C, 0.5 min at 53°C, 1 min at 72°C, and a final 7 min extension step at 72°C (35, 36).
PCR products were further purified by the QIAquick PCR purification kit (Qiagen) using the manufacturer's protocol to remove unincorporated nucleotides and primers. DNA was quantified by determining absorption of samples at 260 nm and purified DNA (approximately 100-200 ng) was digested overnight at 37°C with 15 U of Hae III and 1X buffer (Invitrogen, Carlsbad, CA) in a 20 μl reaction volume. Digestion products were desalted with 2 μl 3 M sodium acetate (pH 5.2) and 50 μl ice cold 95% ethanol and centrifuged 30 min at 16,000 g at 4°C followed by two washes with 200 μl ice cold 70% ethanol. DNA samples were dried in an RC10.10 centrifugal vacuum concentrator (Jouan Inc., Winchester, VA) and re-suspended in 10 μl H2O. Twenty ng PCR-generated DNA from each sample was used for TRFLP analysis. Fragment analysis was done using capillary electrophoresis on an ABI 3100 (Applied Biosystems, Foster City, CA) at the Genomics Resource of Fred Hutchinson Cancer Research Center. GeneScan ROX-labeled GS500 (Applied Biosystems) was chosen as the internal size standard.
TRFLP profiles were analyzed by Dax (Van Mierlo Software Consultancy, Eindhoven, The Netherlands; (14)). The following two binning criteria were employed to identify the fragment peaks. First, the fragments that differed by less than 2 base pairs in different profiles were considered identical and clustered together due to the systematic instrument error in determining fragment size. Second, the relative peak area ratio, Pi, which was calculated by dividing each individual peak area by the total peak area of each profile, must be equal or greater than 1% to be considered as a real peak to eliminate possible background noise.
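The two binning criteria can be sketched in Python as follows. This is a minimal illustration, not the procedure implemented in Dax: the function names, the greedy binning rule (a fragment joins a bin while it lies less than 2 bp from the first fragment in that bin), and the choice not to renormalize Pi after noise removal are all assumptions made for the example.

```python
def bin_fragment_sizes(sizes, tolerance_bp=2.0):
    """Group fragment sizes (bp) pooled from several profiles into bins.

    Assumed rule: sizes are sorted, and a size joins the current bin while
    it differs from the first size in that bin by less than tolerance_bp.
    """
    bins = []
    for size in sorted(sizes):
        if bins and size - bins[-1][0] < tolerance_bp:
            bins[-1].append(size)
        else:
            bins.append([size])
    return bins


def relative_peak_areas(peaks, min_ratio=0.01):
    """Return {fragment_size: Pi} for peaks with Pi >= min_ratio.

    peaks is a list of (fragment_size_bp, peak_area) pairs from one profile.
    Pi is the peak area divided by the total peak area of the profile.
    Pi values are not renormalized after filtering (an assumption).
    """
    total = sum(area for _, area in peaks)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    ratios = {size: area / total for size, area in peaks}
    return {size: pi for size, pi in ratios.items() if pi >= min_ratio}
```

With hypothetical peaks at 76.1, 201.0, and 211.3 bp and areas of 500, 495, and 5, the 211.3 bp peak has Pi = 0.5% and is dropped as background, while the other two are kept.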
The mean and standard deviation of Pi for individual peaks of each triplicate extraction were calculated after the binning process. Analysis of variance (ANOVA) with Scheffe adjustment or t tests were performed using Stata 9.0 (StataCorp LP, College Station, TX) to compare the angularly transformed (i.e., the arc sine values of the square root of the original data) relative peak area ratios among different extraction methods on individual peaks. The number of peaks (richness) and the weighted distribution of the peaks (evenness) were calculated to evaluate the effect of different extraction variables on TRFLP profiles. The total number of distinct fragments in each profile was counted. This number was used as a representative of peak richness (S). Evenness (E), as a measure of peak distribution, was calculated as: E=−∑(Pi)(ln Pi)/lnS (21), where Pi is the relative peak ratio. Classically, these definitions refer to specific species in a community, but not TRFLP peaks (7, 21). In this study, we used them to describe quantitative differences among communities although one TRFLP peak may not be equal to one bacterial species. The square root of evenness of each profile was angularly transformed (i.e., arcsine transformation) and compared by ANOVA using Stata with Scheffe adjustment (31).
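As an illustration of the indices above, the following Python sketch computes richness (S), evenness (E = −∑(Pi)(ln Pi)/ln S), and the angular transformation (arcsine of the square root). The function names are hypothetical, and clamping the input of the transformation to the interval [0, 1] is an assumption added to guard against floating-point rounding; the statistical comparisons themselves were run in Stata and are not reproduced here.

```python
import math


def richness(ratios):
    """Richness S: the number of distinct peaks in a profile."""
    return len(ratios)


def evenness(ratios):
    """Evenness E = -sum(Pi * ln Pi) / ln S for relative peak area ratios Pi."""
    s = richness(ratios)
    if s < 2:
        raise ValueError("evenness needs at least two peaks (ln S must be > 0)")
    if any(p <= 0 for p in ratios):
        raise ValueError("all relative peak area ratios must be positive")
    return -sum(p * math.log(p) for p in ratios) / math.log(s)


def angular_transform(proportion):
    """Arcsine of the square root of a proportion, in radians.

    The input is clamped to [0, 1] so that rounding error (for example an
    evenness of 1.0000000000000002) does not raise a math domain error.
    """
    clamped = min(1.0, max(0.0, proportion))
    return math.asin(math.sqrt(clamped))
```

For a hypothetical profile of four equal peaks (Pi = 0.25 each), S is 4 and E is 1, the maximum; a profile dominated by one peak gives a lower E. The transformed value for a profile would then be `angular_transform(evenness(ratios))`.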
In order to estimate our ability to detect a specific peak of interest in a complex microbial community, we mixed varying ratios of amplicons from the 16S rRNA gene of E. coli with amplicons from the fecal microbial community and measured peak area. We amplified a 334 bp segment of the 16S rRNA gene using primers 27F-FAM and 338R (17). This gave us a TRFLP peak that was not found in the community TRFLP traces and was therefore easy to distinguish amongst the other 38 peaks in the fecal traces. The addition of E. coli ranged from 0.25% to 17% of the total peak area in the TRFLP trace. We used a one-sided t-test to determine the detection limit of the TRFLP analysis by testing whether the peak height of the standard addition was significantly different from the baseline (n=4) (30).
To evaluate the methodological and inter-individual variation of the TRFLP approach, we quantified the variation of Pi for TRFLP peaks in three aliquots taken from the same homogenized sample, and taken from different individuals. To investigate methodological variation in TRFLP patterns, we extracted the genomic DNA from three sub-samples of fecal slurry obtained from one fecal sample for each of three individuals. To investigate the variance in the TRFLP pattern due to PCR, triplicate PCR amplifications were performed on each of three DNA aliquots. To quantify the inter-individual variation in TRFLP patterns, we repeated the above procedure for two more individuals. DNA was extracted using the stool kit with 95°C incubation and 1 min bead beating. PCR and TRFLP were performed as described above. The coefficients of variation (CV) of triplicate PCR on each sub-sample, and of all nine PCRs from triplicate extractions, were calculated across all TRFLP peaks. The angularly transformed relative peak area ratios of nine PCR samples from triplicate extractions were compared by ANOVA with Scheffe adjustment using Stata. Community evenness of each profile was also calculated and angularly transformed as described above. The inter-individual variation of a single peak was calculated as the CV of the mean relative peak area ratios of each individual and the overall inter-individual variation was the mean CV across all common peaks.
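The coefficient of variation calculations described above can be sketched as follows. This is a hedged illustration rather than the authors' script: the function names are hypothetical, the sample standard deviation (n − 1 denominator) is assumed, and profiles are represented as dictionaries mapping a binned fragment size to its relative peak area ratio Pi.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def mean_cv_across_peaks(profiles):
    """Mean CV (%) across the peaks shared by every profile.

    profiles is a list of at least two dicts {fragment_size: Pi}. For
    methodological variation these would be replicate traces of one sample;
    for inter-individual variation, one dict of mean Pi values per person.
    """
    common = set(profiles[0]).intersection(*profiles[1:])
    if not common:
        raise ValueError("profiles share no common peaks")
    per_peak = [cv_percent([p[size] for p in profiles]) for size in sorted(common)]
    return mean(per_peak)
```

For example, three hypothetical replicates with Pi values of 0.09, 0.10, and 0.11 for one peak give a CV of 10% for that peak. Peaks absent from any profile are excluded, which mirrors the restriction to common peaks in the inter-individual comparison.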
The effect of physical disruption of bacterial cells on the quality of genomic DNA extracted was evaluated by gel electrophoresis (Fig. 1). DNA was sheared more severely by longer bead beating although thirty seconds and 1 min bead beating treatments did not notably affect DNA quality. DNA yield varied significantly between different extraction techniques (Fig. 2). In general, more DNA was obtained using the stool kit than the soil kit. Both bead beating and higher incubation temperature treatments resulted in a significantly greater amount of DNA. Moreover, DNA yield increased with longer bead beating time, although the DNA was more sheared (Fig. 1, Fig. 2). In addition, homogenizing fecal samples reduced the variation in the amount of DNA extracted (Table 1). The relatively higher average DNA yield of non-homogenized samples was due to an extremely high value in one of the triplicates whereas the other two were comparable to the homogenized ones. The quality of extracted DNA between homogenized and non-homogenized fecal samples was similar (data not shown).
We assessed the influence of extraction parameters on individual peaks in the TRFLP profiles. Different extraction techniques, incubation temperature, bead beating, and homogenization of the fecal samples were assessed on the angularly transformed relative peak area ratio for individual peaks found in TRFLP profiles.
Significant differences existed in the TRFLP traces between samples extracted with the soil kit and the stool kit. The relative peak area ratios varied significantly for all peaks (Fig. 3, p<0.05, n=3, ANOVA). Two peaks (201 bp and 211 bp) were obtained only by stool kit (Fig. 3). The evenness values were higher for samples extracted with the stool kit than the soil kit if the same bead beating times were used (Table 2).
Incubation temperature did not influence the composition of the TRFLP traces. The TRFLP traces of samples extracted using the stool kit at different incubation temperatures (70°C or 95°C) were not significantly different for all peaks (p>0.05, n=3, t test, data not shown). The transformed evenness values were also not significantly different (p>0.05, n=3, t test, data not shown).
Bead beating, as a means to physically disrupt bacterial cells, significantly altered TRFLP peak composition. Analysis of variance (ANOVA) revealed significant differences (Table 3) in the angularly transformed relative peak area ratios in 11 out of 12 peaks among these profiles. Most strikingly, two peaks (76 bp and 201 bp) were missing (i.e., Pi<1%) in those samples that were not bead beaten (Fig. 4). As compared to samples that were extracted by bead beating, the transformed evenness of non-bead beaten samples was significantly different (p<0.05, n=3, ANOVA, Table 2).
To estimate the detection limit of the TRFLP method and estimate criteria for differentiating noise from sample signal, we added a range of concentrations of a known sample to a TRFLP sample generated from the fecal microbial community (Table 4). We found that a peak that represents less than 1% (p>0.05, n=4) of the total community profile is not significantly different from background (Table 4) and subsequently excluded peaks that represented less than 1% of the total peak area from the TRFLP community analysis.
TRFLP traces from triplicate aliquots taken from the same fecal samples were highly consistent. The variation in relative peak area ratio (as mean CV across all peaks) among triplicate PCR from the same genomic DNA extract ranged from 2.8 to 8.5%. The variation in relative peak area ratio from single PCR reactions from genomic DNA extracts of three aliquots within the same fecal sample ranged from 4.5 to 8.1%. Additionally, there was no significant difference (p>0.05; ANOVA) in relative peak area ratios of individual TRFLP peaks due to DNA extraction when triplicate sub-samples were compared within three individuals (data not shown).
The inter-individual differences in the TRFLP traces of the microbial communities from three individuals were evident even when all samples were treated in the same way (Fig. 5). There were a number of individual-specific peaks and the relative peak area ratios also varied substantially for those common peaks (Fig. 5). The inter-individual variation of a single peak found common to all three individuals ranged from 22.1% to 95.8% and the overall inter-individual variation was 50.3%, which was much higher than the methodological variation. The inter-individual differences were also evident on the diversity indices (Table 5), confirming that TRFLP is an effective method for examining and comparing fecal bacterial community structure.
We used the data from the three individuals' TRFLP traces to estimate our statistical power given different levels of variation. For example, assuming a normal distribution, equal variance, α=0.05, power (1−β)=0.8, and a 10% coefficient of variation within samples (range of 2.1 to 8.5), we need three TRFLP traces to find a 30% difference between two peaks in two different samples (31).
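A sample-size estimate of this kind can be approximated with the standard normal-theory formula for comparing two means, n = 2[(z(α/2) + z(power)) × CV / difference]² per group. The sketch below is one plausible reading, not necessarily the calculation in reference 31. It assumes a two-sided test, treats the CV and the difference as fractions of the mean, and optionally adds z(α/2)²/4, a small-sample adjustment sometimes used for two-sample t tests. The function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def traces_per_group(cv, difference, alpha=0.05, power=0.8,
                     small_sample_correction=True):
    """Approximate number of TRFLP traces per sample (unrounded).

    cv          within-sample coefficient of variation as a fraction (0.10 = 10%)
    difference  relative difference to detect as a fraction (0.30 = 30%)
    Assumes normality, equal variances, and a two-sided test at level alpha.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * cv / difference) ** 2
    if small_sample_correction:
        # Assumed adjustment for using t rather than z at small n.
        n += z_alpha ** 2 / 4
    return n


# Round up to a whole number of traces.
n_traces = math.ceil(traces_per_group(cv=0.10, difference=0.30))
```

Under these assumptions, a 10% CV and a 30% difference round up to three traces per sample with the adjustment and to two without it, so the result is sensitive to the small-sample treatment when n is this low.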
We investigated the effectiveness of TRFLP as a biomarker of the human fecal microbial community. We chose TRFLP of the 16S rRNA molecule over other fingerprinting techniques because it has the advantage of being rapid and reproducible (7, 25). As in other molecular typing methods, there are many variables that could potentially bias the outcome of TRFLP analysis. Here, we tested various DNA extraction parameters that influenced the TRFLP peak composition in community fingerprint profiles including extraction techniques, incubation temperatures, and physical disruption for bacterial cell-lysis. We further evaluated the optimized TRFLP method to quantify methodological and inter-individual variation in microbial community profiles. We concluded that the optimum method for extracting fecal bacterial genomic DNA and subsequent TRFLP analysis was using QIAamp DNA stool minikit with 95°C incubation and 0.5-1 min bead beating. Using this approach, we can reliably detect peaks that represent 1% of the total peak area in a TRFLP trace (Table 4). Based on these results, we are confident that this approach can be applied to fecal microbial community analysis for human intervention and observational population-based studies.
We evaluated the effect of a variety of fecal bacterial DNA extraction techniques on DNA yield, DNA quality, and TRFLP profiles. Because the TRFLP method is based upon PCR which can be easily inhibited by compounds that co-extract with environmental genomic DNA, we chose to compare two DNA extraction kits which incorporated steps that removed environmental contaminants from genomic DNA. Although processing time was slightly longer, the quantity and quality of the DNA extracted from fecal samples using the stool kit was better than the soil kit (Fig. 1, Fig. 2). The composition of the TRFLP profiles also varied depending upon the extraction kit used (Fig. 3). Using different molecular techniques, others have also shown that the stool kit resulted in higher quality DNA and bacterial community profiles (18, 22, 29, 32). We also confirmed that a higher incubation temperature during the cell lysis procedure improved the DNA yield (Fig. 2) although temperature alone had no significant effect on the relative peak area ratios (data not shown). Bead beating fecal samples prior to chemical lysis resulted in higher DNA yield (Fig. 2) and also introduced additional peaks in the TRFLP profiles (Fig. 4). Others have found that bead beating affected the composition of community profiles as measured by DGGE, possibly due to more efficient lysis of gram-positive bacteria with dense, thick cell walls containing multiple layers of peptidoglycan that can only be effectively broken by mechanical action instead of sole chemical treatment (24, 27, 39). However, an extended bead beating time was not an improvement because genomic DNA was sheared and the TRFLP profiles were similar, indicating that a brief bead-beating treatment was sufficient to break most bacterial cells (Fig. 4; Table 2; Table 3). Rantakokko and Jalava (27) also found increased shearing of genomic DNA correlated with longer bead-beating times. 
Fecal homogenization before sample processing made the sample aliquoting easier and it reduced the variability of DNA yield and the relative peak area ratio of individual peaks in TRFLP profile (Table 1). We also show that we can reliably detect peaks that represent 1% of the total peak area (Table 4). Therefore, we concluded that extracting fecal bacterial DNA using the stool kit with 95°C incubation and moderate bead beating (0.5-1 min) on homogenized fecal sample gave representative data without significant omission of peaks from the fecal microbial fingerprinting profile.
Primer choice and nucleotide mismatch between primer and target genomic DNA may influence the resulting TRFLP fingerprint of the gut microbial communities. The universal primers (27f and 1492r) were used to amplify the 16S rRNA genes. We cannot exclude the possibility that different bacterial rRNA species, such as Lactobacillus and Actinobacteria, were under-represented because they were not amplified with the same efficiency due to nucleotide differences in these PCR priming regions (26). Although it was possible that distortion of community composition was introduced here, we believe that this was unlikely to affect our conclusion about the influence of extraction variables on TRFLP patterns. Even though the data might have been biased due to systematic error, the results showed that random error was minimized; thus, the differences among TRFLP profiles were reproducible. In addition, TRFLP analysis can be used to characterize these under-represented groups by using group-specific primers instead of universal primers (4, 15).
Several studies have raised concerns about the effects of partial enzyme digestion on TRFLP pattern analysis (10, 11, 23). In these studies, it was concluded that inconsistent fragment patterns were due to incomplete digestion. Partial digestion could be caused by the blocking of restriction sites, and/or chimeric or 5′ overhang structures of PCR product (10, 11). In our study, high concentrations of endonuclease and long incubation times were used in an attempt to minimize possible incomplete digestion. Our protocol resulted in consistent relative peak area ratio data of triplicate PCR samples (Fig. 3, Fig. 4, Fig. 5) and reliable detection of peaks that represent 1% or more of the total peak area (Table 4). Although incomplete digestion could still have existed when enzyme and incubation time were not limiting, there was no adequate way to adjust for this problem. Osborn et al. (2000) suggested that a parallel experiment with reduced amount of enzyme could help identify those potential pseudo terminal restriction fragments (TRFs) which would increase in relative peak area ratio with decreased enzyme concentration (25). However, exclusion of those TRFs completely may be inadequate because real fragments of the same size would also be excluded from the analysis.
High methodological variability can greatly interfere with inter-sample comparison, particularly in human studies where within and between-individual variance is great (38). Therefore, it is important to identify and minimize methodological variation. In our study, the level of methodological variability was lower (4.5-8.1 %) than that found in other TRFLP analyses, which ranged from 11.6 to 12.2% (25). Inter-individual variation in peaks common to all three study participants was 50.3% and the diversity indices varied as well suggesting that composition and relative abundance of the microbial community varied substantially from individual to individual (Fig. 5; Table 5). Similarly, other studies using various molecular typing methods also revealed high person-to-person variation in both gut microbial community composition and relative abundance (24, 28, 33, 34). However, they did not assess the methodological variability and inter-individual variability at the same time. Our study will be useful for estimating sample sizes required for human population-based studies of disease risk as influenced by the gut microbial community.
In conclusion, for TRFLP analysis, the most effective approach to extract fecal bacterial genomic DNA was using QIAamp DNA stool minikit with 95 °C incubation and 0.5-1 min bead beating. Homogenizing the fecal sample improved sample handling and reduced variance in DNA yield and TRFLP profiles. The TRFLP approach, when standardized, was reproducible and informative for characterizing the microbial community and inter-individual differences in human fecal samples. The advantage of this method is that the TRFLP profiles can be used as discrete units for comparative analysis without knowing their particular content, although clone libraries can be used to identify the composition of the peaks. Moreover, a combinatorial approach of nested primers that are specific for phylogenetic clusters can be used to give more targeted species resolution (4). However, our goal was to establish overall patterns in order to elucidate similarities and differences between microbial communities among individuals rather than to identify each bacterial species in the fecal samples. With the awareness of its limitations, TRFLP can serve as a useful biomarker of microflora community structure and provide a powerful tool for population-based epidemiologic studies of gut bacteria and health outcomes.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.