Description of the environmental samples
The two environments sampled within the Soudan Mine are shown in Figure 1. Groundwater is scarce in the banded iron formations of the mine at our sampling depth of 714 m below the surface. However, small amounts of water emerge steadily from exploration boreholes that extend downward from the deepest level of the mine. The two sets of samples were collected from water trickling out of two such boreholes that are separated by about 100 m (Figure 1). This water is a calcium-sodium-chloride solution about twice as salty as seawater. It is anoxic, with up to 150 ppm of dissolved ferrous iron and variable enrichments of several trace elements. At both locations, the water emerging from the boreholes produces cm-scale "Black" environments that appear to extend down into the borehole. The water flowing away from each borehole, across the floor of the mine tunnel, is exposed to the oxygenated mine atmosphere and transitions to a sequence of "Red" environments within a few cm of the orifices. The oxidized environments are continuously fed by anoxic water flowing from the boreholes.
The water in the borehole that yielded the Black sample, like the water at a number of similar sites found throughout the mine, has a pH of 6.70 and a redox potential of -142 mV. Some of the Black areas are associated with bubbling of gas. The Black sediment contained 5.8 × 10^5 microbes per ml. X-ray diffraction analyses of the minerals in this area show that chlorite-serpentine [(Mg,Al)6(Si,Al)4O10(OH)8], clinochlore, ferroan [(Mg,Fe)6(Si,Al)4O10(OH)8], quartz, and silinaite [LiNaSiO5·HCl] are present in the Black sediments. Water slowly flows from the borehole into the stream running down the main mine tunnel. As the water comes in contact with oxygen in the passageway, the pH rapidly decreases to 4.37 and the redox potential increases to -8 mV. The Red sample contained 1.2 × 10^6 microbes per ml, and these sediments include goethite [FeO(OH)], followed by szaibelyite [(Mg,Mn)BO2(OH)] and sussexite [(Mn,Mg)BO2(OH)].
Figure 1 Sampling from the Soudan Mine. The Soudan Mine is an Algoma-type Iron Formation rich in hematite. Panel A shows a cross-section of the mine looking East-North-East at 78.5°. Panel B depicts a three dimensional view of the mine, including the cross-section (more ...)
The first two pyrosequences of environmental samples
DNA was purified from the two samples, amplified using the GenomiPhi procedure (GE Healthcare, Piscataway, NJ), and then sequenced by 454 Life Sciences. A summary of the sequence characteristics determined using the pyrosequencing technique is shown in Table 1. The raw sequence reads and quality scores [see Additional files 6] are provided in compressed format.
Summary of pyrosequence data from the Soudan Mine
The two samples produced more than 70 Mbp of sequence data from over 700,000 sequences, and there was no significant skew in the sequence data (as measured by dinucleotide frequency) when the data generated by pyrosequencing was compared to complete genome sequences.
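The dinucleotide comparison mentioned above can be illustrated with a minimal sketch. It assumes reads are available as plain strings; the function names `dinucleotide_frequencies` and `max_frequency_difference` are hypothetical and are not taken from the original analysis, which does not describe its implementation.

```python
from collections import Counter
from itertools import product


def dinucleotide_frequencies(sequences):
    """Return the relative frequency of each of the 16 dinucleotides.

    Overlapping pairs are counted; pairs containing anything other than
    A, C, G or T (for example N) are skipped.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - 1):
            pair = seq[i:i + 2]
            if set(pair) <= set("ACGT"):
                counts[pair] += 1
    total = sum(counts.values())
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product("ACGT", repeat=2)
    }


def max_frequency_difference(freq_a, freq_b):
    """Largest absolute difference between two dinucleotide profiles."""
    return max(abs(freq_a[key] - freq_b[key]) for key in freq_a)
```

Computing one profile from the pyrosequencing reads and one from complete genome sequences, then comparing them, gives a rough indication of skew; how large a difference counts as significant is a separate statistical question that this sketch does not address.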
16S rDNA analysis of the samples
The two sequence libraries were compared to the 16S rDNA database from the Ribosomal Database Project [16]. As shown in Figure 2, the Black sample was dominated by Actinomycetales such as Brevibacterium that volatilize sulfur via an organic intermediate and can also break down complex heterocyclic and polycyclic ring structures [17]. In contrast, members of the Chromatiales, including the family Chromatiaceae and the genus Halothiobacillus, dominate the Red sample. These chemoautotrophic Bacteria often use the Calvin-Benson-Bassham cycle to fix CO2 using energy from the oxidation of iron or sulfur, and consequently they would be expected to be present in samples from an iron-rich deposit. These two communities are fundamentally different both from each other and from the community identified in the Iron Mountain metagenome [7]. The community in the Red sample has a much higher species richness than the Black sample, and the differences between the Soudan and Iron Mountain mines reflect the iron composition (hematite versus pyrite), temperature, and pH of the various environments [7].
Figure 2 Composition of the 16S rDNA sequences from the two samples and comparison of 16S sequences from the 454 libraries and a traditional clone library. The percentage of all sequences from each library in each of the orders is shown for the 454-sequenced Black (more ...)
A 16S clone library was created from the Red sample to validate the 454 sequencing approach. Ninety-six clones were sequenced using traditional techniques and compared to the 16S rDNA database from the Ribosomal Database Project [16]. The 16S genes sequenced in the 454 library and the 16S sequences from the clone library are remarkably congruent, as shown in Figure 2.
We also used the 16S sequences to evaluate the randomness of the library. An analysis of 160 bacterial genome sequences in the SEED database [15] with annotated 16S genes showed that about 1 in 10^5 bases is from a 16S gene. Based on this estimate, as a rule of thumb the Soudan samples are expected to contain approximately 3,000 bases of 16S sequence in total, or approximately 30 sequences. Twenty-four sequences from the Black sample and seventy-six sequences from the Red sample were found to have significant similarity to 16S rDNA (an E value less than 1 × 10^-5 and a match of 50 bp or more).
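The two significance criteria stated above (E value below 1 × 10^-5 and a match of at least 50 bp) can be expressed as a simple filter. The sketch below assumes the similarity search results are available in BLAST 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the text does not say which search tool or output format was used, so this is an illustrative assumption, and the function names are hypothetical.

```python
def filter_hits(lines, max_evalue=1e-5, min_length=50):
    """Keep hits with E value < max_evalue and alignment length >= min_length.

    `lines` is an iterable of tab-separated rows in the assumed
    12-column tabular layout; comment and malformed rows are skipped.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue
        length = int(fields[3])      # alignment length in bp
        evalue = float(fields[10])   # expectation value
        if evalue < max_evalue and length >= min_length:
            kept.append((fields[0], fields[1], length, evalue))
    return kept


def count_reads_with_hits(hits):
    """Number of distinct query reads with at least one retained hit."""
    return len({query for query, *_ in hits})
```

Counting distinct reads rather than rows avoids counting a read twice when it matches more than one 16S reference.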
Metabolic potential from the metagenome library
Sequences from both libraries were compared to the SEED database, a curated database of microbial genomes [15]. Annotation in the SEED is organized primarily around subsystems, a technique pioneered by the Fellowship for Interpretation of Genomes [15]. Subsystems are groups of genes that function together, such as the genes whose products are involved in a metabolic pathway, or the group of genes whose products make a cellular structure. A summary of the subsystem hits is shown in Figure 3, and all matches to subsystems are provided as supplemental data [see Additional files 1]. These subsystems show that pyrosequencing generates sequences that represent a large swathe of central metabolism in each of the environments. Metabolic potential that is expected to be present in sulfur-utilizing chemoautotrophs is represented in the mine libraries, including the Calvin-Benson cycle, inorganic sulfur assimilation, and amino acid biosynthetic genes. The comparison of the subsystem similarities suggested the simple hypothesis that groups of genes (or subsystems) important to a particular environment will be enriched in that environment. To distinguish between ecologically important differences and differences caused by sampling error, a method was devised to identify those subsystems that are statistically significantly overrepresented in one sample when compared to another [11].
Figure 3 Subsystems in the Red and Black Samples. The occurrence of classes of subsystems is shown as a percent of all subsystems in each sample for the Red and Black samples. Notes and abbreviations: The subsystem class "Glu, Asp" also contains Gln and Asn. The (more ...)
Subsystems enriched in the Black or Red samples
Table 2 shows subsystems that were determined to be statistically more common, with 95% confidence, in either the Red or Black samples from the Soudan mine. The subsystems that are overrepresented in a metagenome can yield significant insights into the microbial ecology of the environment. A few specific examples are detailed below.
Table 2 Subsystems statistically more likely to be present in either the Red or Black samples. These subsystems are more frequently found among sequences from either the Red or Black samples with a sample size of 5,000 proteins, 20,000 repeated samples, and (more ...)
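The parameters quoted in the Table 2 caption (a sample size of 5,000 proteins and 20,000 repeated samples) together with the 95% confidence level point to a resampling comparison. The sketch below shows one simplified way such a comparison could be set up; it is an illustration under stated assumptions, not the published method [11]. It assumes each sample is represented as a list with one subsystem label per protein hit, and the function name `enriched_subsystems` and the labels "A" and "B" are hypothetical.

```python
import random
from collections import Counter


def enriched_subsystems(hits_a, hits_b, sample_size=5000,
                        n_resamples=20000, confidence=0.95, seed=0):
    """Report subsystems whose resampled count difference excludes zero.

    Each iteration draws `sample_size` hits with replacement from both
    samples and records count(A) - count(B) per subsystem. A subsystem is
    reported as enriched in "A" (or "B") when the central `confidence`
    interval of those differences lies entirely above (or below) zero.
    The default parameters are slow in pure Python.
    """
    rng = random.Random(seed)
    names = sorted(set(hits_a) | set(hits_b))
    diffs = {name: [] for name in names}
    for _ in range(n_resamples):
        count_a = Counter(rng.choices(hits_a, k=sample_size))
        count_b = Counter(rng.choices(hits_b, k=sample_size))
        for name in names:
            diffs[name].append(count_a[name] - count_b[name])

    tail = (1.0 - confidence) / 2.0
    lo_idx = min(n_resamples - 1, int(tail * n_resamples))
    hi_idx = max(0, int((1.0 - tail) * n_resamples) - 1)
    enriched = {}
    for name, values in diffs.items():
        values.sort()
        if values[lo_idx] > 0:
            enriched[name] = "A"
        elif values[hi_idx] < 0:
            enriched[name] = "B"
    return enriched
```

Drawing the same number of hits from each sample keeps libraries of different sizes comparable, which is the reason for fixing a common sample size before counting.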
Several subsystems involved in iron uptake and utilization, such as siderophores and ABC transporters for ferrichrome and heme, are more common in the Black sample. The overall concentration of iron at the two sites was similar (Table 3; Figure 5). However, the iron in the Black sample is present as either Fe2+ dissolved in the water or as ferroan [(Mg,Fe)6(Si,Al)4O10(OH)8]. In either case, the ferrous iron cannot be assimilated biologically, and the microbes are forced to scavenge for the limited ferric iron (Fe3+) available. In contrast, in the Red sample, goethite [FeO(OH)] is present and ferric iron is more readily available for biological utilization. The Black sample is also enriched for amino acid degradation pathways, and microbes may be assimilating nitrogen or carbon through these pathways. It is not currently apparent where free amino acids would come from.
Water chemistry from Soudan Mine. No significant differences were found for Ca, Mg, Na, K, Li, Al, Mn, Sr, Ba, Si, Cr, Co, Ni, Cu, Zn, As, Se, Rb, Cd, Cs, Pb, total alkalinity, lactate, acetate, formate, chlorate, oxalate, and trace elements.
Figure 5 Cations and Anions found in the Soudan Mine. The pie chart shows the abundance of cations and anions found in the mine. The numbers in parentheses are the concentrations (in ppm) of each ion in the "Black" and "Red" samples respectively. The minor ions (more ...)
The respiratory complexes and cytochrome c oxidases are more commonly found in the sample from the oxidized environment (the Red sample; Table 2). Respiration proceeds via multiple electron transfer steps (Figure 6). In an aerobic environment, electrons are passed from hydrogenases to quinones (e.g., ubiquinone, menaquinone, and plastoquinone) and then to cytochromes, resulting in the conversion of oxygen to water. In anaerobic environments the electrons are shuttled through nitrate and nitrite reductases, reducing NO3 first to NO2 and then to N2 gas. The Black sample is enriched for these denitrification genes, suggesting that the latter pathway predominates there, while the Red sample is enriched for components of the aerobic respiratory pathway. Moreover, the Black sample had a lower concentration of free nitrate than the Red sample, presumably because nitrate is being used as an electron acceptor during respiration (although nitrite was below the level of detection in both samples; Table 3).
Figure 6 Respiration in aerobic and anaerobic environments. Among other potential pathways in the Soudan mine, electrons are transferred from hydrogenases to either cytochromes and then to oxygen to produce water in an oxidative environment, or via nitrate and (more ...)
This analysis demonstrates that by combining pyrosequencing, subsystems analysis, and comparative metagenomics the microbiology of different environments can be correlated with the chemistry and hydrogeology of those environments to identify significant ecological differences between them.
Comparisons between Soudan and Iron Mountain communities
A previous study used Sanger sequencing to determine the metagenome of the Iron Mountain community [7]. The environmental differences (such as the difference in temperature) account for the predominant differences between the microbial communities. The organismal differences are reflected in the individual biochemistries of the samples [see Additional files 4]. For example, the AMD metagenome contains significantly more occurrences of Archaea-specific subsystems, such as those involved in protein biosynthesis, than the Soudan samples. The AMD sample has a preference for CO2 fixation and simple carbohydrate metabolism when compared to either of the Soudan samples. There are also many currently unexplained differences between subsystems found in these environments that must relate the biology of the organisms to the chemistry of the environment.
Comparisons between Soudan and other metagenome sequences
The SEED database used for these studies contained 351 subsystems. The vast majority (83%) of subsystems were present in one or more of the sequenced metagenomes, and over half (52%) of the subsystems are present in every metagenome. A comparison of the subsystem classification reveals trends between the metagenomes (Figure 4). For example, oxygenic photosynthesis is prevalent in samples that are naturally illuminated, such as the Sargasso Sea [10]. This analysis also suggested that phosphorus metabolism is more prevalent in oceanic surface waters than in terrestrial environments. Comparisons of the Minnesota Farm metagenome [6] with the Soudan Mine metagenomes, also from Minnesota, showed important differences in the production and consumption of secondary metabolites, membrane transport, and fatty acid metabolism. The complete lists of subsystems that differ statistically significantly between the Red or Black samples and each of the previously published metagenomes are supplied as supplemental material [see Additional files 4].
Figure 4 Subsystems present in different metagenome sequences. The subsystems present in the Soudan samples, the Iron Mountain AMD sample, the Minnesota Farm and the Sargasso Sea are shown grouped by family. The red x corresponds to very low abundance or complete (more ...)
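The two coverage figures quoted for the subsystem comparison (the share of subsystems found in at least one metagenome and the share found in every metagenome) are simple set operations. A minimal sketch follows, assuming each metagenome is represented as a set of subsystem names; the function name `subsystem_coverage` and the input layout are hypothetical.

```python
def subsystem_coverage(all_subsystems, metagenomes):
    """Return (fraction in at least one metagenome, fraction in every one).

    `all_subsystems` is the full catalogue of subsystem names and
    `metagenomes` maps a metagenome name to the subsystems detected in it.
    """
    catalogue = set(all_subsystems)
    if not catalogue or not metagenomes:
        return 0.0, 0.0
    present = [set(found) & catalogue for found in metagenomes.values()]
    in_any = set().union(*present)         # seen in one or more metagenomes
    in_all = set.intersection(*present)    # seen in every metagenome
    return len(in_any) / len(catalogue), len(in_all) / len(catalogue)
```

With a catalogue of 351 subsystems, the first value would correspond to the reported 83% and the second to the reported 52%, given the appropriate presence sets for each metagenome.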