Our perspective on microbial diversity has improved enormously over the past few decades. In large part this has been due to molecular phylogenetic studies that objectively relate organisms. Phylogenetic trees based on gene sequences are maps with which to articulate the elusive concept of biodiversity. Thus, comparative analyses of small-subunit rRNA (16S or 18S rRNA) and other gene sequences show that life falls into three primary domains, Bacteria, Eucarya, and Archaea (51, 52). Based on rRNA trees, the main extent of Earth’s biodiversity is microbial. Our knowledge of the extent and character of microbial diversity has been limited, however, by reliance on the study of cultivated microorganisms. It is estimated that >99% of microorganisms observable in nature typically are not cultivated by using standard techniques (1).
Recombinant DNA and molecular phylogenetic methods have recently provided means for identifying the types of organisms that occur in microbial communities without the need for cultivation (see references 1, 20, and 35 for reviews). Results from application of these methods to a number of diverse environments confirm that our view of microbial diversity was limited and point to a wealth of novel and environmentally important diversity yet to be studied (34). It is the aim of this review to collate, compare, and incorporate the results of the environmental sequence-based studies into the context of known bacterial diversity. We discuss the sequence data at the taxonomic level of the phylogenetic division because divisions constitute first-order clades for describing the breadth of bacterial diversity. Although we have yet to determine even the outlines of the bacterial tree, common threads are beginning to emerge that revise our current views of bacterial diversity and distribution in the environment.
In 1987, Woese described the bacterial domain as composed of about 12 natural relatedness groups, based mainly on analyses of familiar cultivated organisms such as cyanobacteria, spirochetes, and gram-positive bacteria (all of which, based on rRNA sequence divergence, display greater evolutionary depth than plants, animals, and fungi) (51). These relatedness groups have variously been called “kingdoms,” “phyla,” and “divisions”; we use the last term. For the purposes of this review we define a bacterial division purely on phylogenetic grounds as a lineage consisting of two or more 16S rRNA sequences that are reproducibly monophyletic and unaffiliated with all other division-level relatedness groups that constitute the bacterial domain. We judge reproducibility by the use of multiple tree-building algorithms, bootstrap analysis, and varying the composition and size of data sets used for phylogenetic analyses. The typical interdivisional rRNA sequence difference is 20 to 25%. For comparison, the 16S rRNAs of Escherichia coli and Pseudomonas aeruginosa, both representatives of the γ group of Proteobacteria, differ overall by about 15%; the 16S rRNAs of E. coli and Bacillus subtilis (“low-G+C gram-positive bacterial” division) differ by about 23%.
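The percent differences quoted above are pairwise comparisons of aligned rRNA sequences. As a minimal sketch of that kind of calculation (not the procedure used for this review), the Python function below counts mismatches over positions where both sequences have a nucleotide. The function name `percent_difference`, the `-` gap symbol, and the toy fragments are assumptions made for illustration; the fragments are invented and are not real 16S rRNA data.

```python
def percent_difference(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of compared positions at which two aligned sequences differ.

    Positions where either sequence has a gap are skipped, so that only
    homologous nucleotides contribute to the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # not a homologous position
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no homologous positions to compare")
    return 100.0 * mismatches / compared


# Toy aligned fragments (hypothetical, for illustration only).
toy_a = "ACGUACGUAC-GUACGUACGU"
toy_b = "ACGAACGUACAGUUCGUACGA"
print(f"{percent_difference(toy_a, toy_b):.1f}% difference")
```

Skipping gapped positions is one possible convention; other treatments of gaps and ambiguous bases would give somewhat different values.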
At the current stage in the phylogenetic classification of Bacteria, divisions are not consistently named or taxonomically ranked. rRNA-defined divisions are identified by classes (e.g., Proteobacteria and Actinobacteria), orders (e.g., Thermotogales and Aquificales), families (e.g., Chlorobiaceae), generic names such as the Nitrospira group (11), or common names such as the green nonsulfur (GNS) bacteria and low-G+C gram-positive bacteria (51). Division-level nomenclature has not even been consistent between studies, so some divisions are identified by more than one name. For instance, green sulfur bacteria is synonymous with Chlorobiaceae; high-G+C gram-positive bacteria is synonymous with Actinobacteria and Actinomycetales. Indeed, it probably is premature to standardize taxonomic rankings for bacterial divisions at this point when our picture of microbial diversity is likely still incomplete and the topology of the bacterial tree is still unresolved.
In the past decade the number of identifiable bacterial divisions has more than tripled to about 40 due in significant part to culture-independent phylogenetic surveys of environmental microbial communities (21, 34). These analyses rely on sequences of rRNA genes obtained by cloning directly from environmental DNA or, as in the majority of studies, after amplification by PCR (1, 20, 35). Figure 1 represents the division-level diversity of the bacterial domain as inferred from representatives of the approximately 8,000 bacterial 16S rRNA gene sequences currently available. Although 36 divisions are shown in Fig. 1, several other division-level lineages are indicated by single environmental sequences (9, 21, 37), suggesting that the number of bacterial divisions may be well over 40. Several of the described divisions are well represented by cultivated strains and were the first to be characterized phylogenetically (51). The majority of the bacterial divisions, however, are poorly represented by cultured organisms. Indeed, 13 of the 36 divisions shown in Fig. 1 are characterized only by environmental sequences (shown outlined) and so are termed “candidate divisions” to indicate their unsubstantiated status as new bacterial divisions (21). One of these candidate divisions, OP11, is now sufficiently well represented by environmental sequences to conclude that it constitutes a major bacterial group (see below). Phylogenetic studies so far have not resolved branching orders of the divisions; bacterial diversity is seen as a fan-like radiation of division-level groups (Fig. 1). The exception to this, however, is the Aquificales division, which branches most deeply in the bacterial tree in most analyses.
Culture-dependent studies indicate that representatives of some bacterial divisions are cosmopolitan in the environment, whereas others appear restricted to certain habitats (39). Culture-independent studies so far conducted reflect and expand this view. Table 1 summarizes the environmental distribution of sequences by habitat type, compiled from most of the available 16S rRNA-based clonal analyses: 86 studies contributing nearly 3,000 sequences. An expanded version of this table that details division-level representation in the individual studies is available at http://crab2.berkeley.edu/pacelab/176.htm. Table 1 includes only divisions for which representatives have been detected in at least two independent studies and for which at least one near-complete 16S rRNA gene sequence is known. Table 1 is, therefore, not an exhaustive listing of potential division-level diversity for all studies.
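The two inclusion criteria for the table (detection in at least two independent studies, and at least one near-complete 16S rRNA gene sequence) can be expressed as a simple filter. The Python sketch below is illustrative only: the record type `CloneSequence`, the function `divisions_for_table`, the placeholder division and study names, and in particular the 1,400-nt cutoff for "near-complete" are all assumptions, since the text does not give a numerical threshold.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed cutoff for a "near-complete" sequence; not stated in the review.
NEAR_COMPLETE_NT = 1400


@dataclass(frozen=True)
class CloneSequence:
    division: str   # division-level group the sequence falls in
    study_id: str   # identifier of the study that reported it
    length_nt: int  # sequence length in nucleotides


def divisions_for_table(records):
    """Return divisions seen in >= 2 studies with >= 1 near-complete sequence."""
    studies = defaultdict(set)
    longest = defaultdict(int)
    for rec in records:
        studies[rec.division].add(rec.study_id)
        longest[rec.division] = max(longest[rec.division], rec.length_nt)
    return sorted(
        division
        for division in studies
        if len(studies[division]) >= 2 and longest[division] >= NEAR_COMPLETE_NT
    )


# Hypothetical records with placeholder names, for illustration only.
example_records = [
    CloneSequence("division_A", "study_1", 1450),
    CloneSequence("division_A", "study_2", 320),
    CloneSequence("division_B", "study_1", 1480),  # only one study
    CloneSequence("division_C", "study_2", 410),   # no near-complete sequence
    CloneSequence("division_C", "study_3", 450),
]
print(divisions_for_table(example_records))
```

Counting distinct study identifiers, rather than sequences, is what makes the first criterion a test of independent detection.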
Sequence representatives of several bacterial divisions have been identified in a wide range of habitats, suggesting the cosmopolitan or ubiquitous distribution of the corresponding organisms in the environment and, potentially, their broad metabolic capabilities. Some of these cosmopolitan divisions are well known from cultivation studies; however, others are little known or have not yet been detected by cultivation. Figure 2 summarizes the representation of selected cosmopolitan divisions by sequences of cultivated and uncultivated organisms. The Proteobacteria (purple photosynthetic bacteria and relatives), Cytophagales (Bacteroides-Cytophaga-Flexibacter group), and the two gram-positive divisions, Actinobacteria and low-G+C gram-positive bacteria, are well represented by cultivated organisms and therefore are familiar to us in principle. These four divisions account for 90% of all cultivated bacteria characterized by 16S rRNA sequences and approximately 70% of the environmental sequences collated in Table 1. By contrast, other cosmopolitan divisions revealed by clonal analyses, such as Acidobacterium, Verrucomicrobia, GNS bacteria, and OP11, are poorly represented by sequences from cultivated organisms (Fig. 2) and consequently are little known with regard to their general properties. Although many of the bacterial divisions occur widely, others seem to occupy a more limited range of habitats (Table 1). All cultivated representatives of Aquificales, for instance, are thermophilic hydrogen metabolizers, and all environmental sequences of Aquificales have been obtained only from high-temperature environments. This suggests a specialized habitat niche for this group. Alternatively, the apparently limited environmental distribution may simply reflect a sampling or methodological artifact and representatives of such divisions may be present in a wider range of habitats, but not yet detected.
The database of environmental rRNA sequences is compromised in resolving some phylogenetic issues by a large number of relatively short sequences. More than half of the sequences collated in Table 1 are less than 500 nucleotides (nt) long, which represents only one-third of the total length of 16S rRNA. This is due to an unfortunate trend in many environmental studies of sequencing only a portion of the gene in the belief that a few hundred bases of sequence data is sufficient for phylogenetic purposes. Indeed, 500 nt is sufficient for placement if some longer sequence is closely related (>90% identity in homologous nucleotides) to the query sequence. In the case of novel sequences, <85% identical to known sequences, however, <500 nt is usually insufficient comparative information to place the sequence accurately in a phylogenetic tree and can even be misleading.
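The rule of thumb in the preceding paragraph can be written as a small triage function. This is a sketch under stated assumptions rather than an established criterion: the function name `placement_outlook`, the three labels, and the treatment of combinations the text does not discuss (returned as "not covered") are choices made here for illustration.

```python
def placement_outlook(length_nt: int, best_identity_pct: float) -> str:
    """Classify a query sequence by the length/identity rule of thumb.

    length_nt: length of the query sequence in nucleotides.
    best_identity_pct: identity (in homologous nucleotides) to the closest
        longer reference sequence, in percent.
    """
    if length_nt <= 0 or not 0.0 <= best_identity_pct <= 100.0:
        raise ValueError("length must be positive and identity within 0-100")
    if length_nt >= 500 and best_identity_pct > 90.0:
        # About 500 nt suffices when a close, longer relative is known.
        return "placeable"
    if length_nt < 500 and best_identity_pct < 85.0:
        # Short and novel: placement is unreliable and may mislead.
        return "unreliable"
    # Combinations the text does not address explicitly.
    return "not covered"


for length, identity in [(500, 95.0), (400, 80.0), (1450, 80.0)]:
    print(length, identity, placement_outlook(length, identity))
```

The practical point is that the short, novel sequences are exactly the ones for which additional sequence data matter most.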
Since all but 4 (40, 46, 49, 50) of the 86 studies collated in Table 1 were conducted using PCR to amplify rDNA from extracted environmental DNA, the question arises as to whether molecular analyses accurately reflect the division-level diversity that occurs in the environment. It is well established that PCR-associated artifacts such as differential amplification of different rDNA templates (36, 44), sensitivity to rRNA gene copy number (12), PCR primer specificity (48), sensitivity to template concentration (6), amplification of contaminant rDNA (45), and formation of chimeric sequences (23) may skew our assessment of microbial diversity. Most of the studies collated in Table 1, however, analyzed tens to hundreds of clones, so it seems likely that these studies have sampled the main types of sequences in the communities examined. We believe, acknowledging the caveats of the methodology, that the clonal analyses collated in Table 1 probably include the most abundant (metabolically active) bacterial sequence types in the samples analyzed, likely representing the members of the communities that are involved in the principal metabolic activities, such as carbon cycling.
The rRNA sequence studies of environmental organisms probably identify the abundant organisms in the environments studied and, therefore, account for the organisms that participate significantly in the maintenance of the communities. Because of their abundance in the environment, representatives of some poorly studied phylogenetic divisions are predicted to play significant roles in environmental chemistry. Examples of such divisions, which because of their potential environmental significance merit study, are the Acidobacterium division, the Verrucomicrobia, the GNS bacteria, and candidate division OP11.
The Acidobacterium group is a newly recognized bacterial division with only three cultivated representatives: Acidobacterium capsulatum (18), Holophaga foetida (26), and Geothrix fermentans (28). Figure 3 is a phylogenetic dendrogram of this group, including selected environmental representatives. The limited physiological information known about these organisms provides few clues to properties that might be general throughout the division. Acidobacterium is a moderately acidophilic aerobic heterotroph; Holophaga and Geothrix are strict anaerobes that ferment aromatic compounds and acetate, respectively. The majority of sequences that make up this division, however, are from environmental clones. At least eight monophyletic subdivisions in the Acidobacterium group are identified by phylogenetic analyses (Fig. 3 [24, 29]). We define a subdivision as a lineage composed of two or more 16S rRNA sequences within a division that are reproducibly monophyletic and unaffiliated with all other representatives of that division. Acidobacterium subdivisions 1, 3, 4, and 6 are well represented by environmental clone sequences from independent studies, yet no cultivated strains are known with the exception of subdivision 1, represented by A. capsulatum. The widespread occurrence of environmental sequences belonging to the Acidobacterium division (Table 1) suggests that members of this group are ecologically significant constituents of many ecosystems, particularly soil communities. They have been detected in every clonal analysis of soils (with a wide range of chemical properties), as well as in other habitats, including a peat bog, acid mine drainage, a contaminated aquifer, a hot spring, a freshwater lake, and a sample of the Atlantic ocean from a depth of 1,000 m (Fig. 3).
In situ single-cell analyses with fluorescent hybridization probes specific for Acidobacterium subdivision 6 small-subunit rRNA indicate that this subdivision is morphologically diverse (29), as expected for a broad phylogenetic group. Members likely are metabolically diverse as well: the depth of phylogenetic diversity (depth of branching) in the Acidobacterium division is nearly as great as in the Proteobacteria.
Verrucomicrobia is a newly proposed division of Bacteria (17) represented by a handful of isolates: Verrucomicrobium spinosum (after which the division is named) (47), four Prosthecobacter species (17), and three strains of ultramicrobacteria (22). Verrucomicrobium and Prosthecobacter are prosthecate bacteria isolated from freshwater, and the ultramicrobacteria, “dwarf-cell” strains only about 0.1 μm³ in volume, were isolated from a soil habitat. All of these isolates preferentially use sugars as growth substrates. Culture-independent analyses indicate that the Verrucomicrobia, like members of the Acidobacterium division, are widespread in the environment and abundant, particularly in soils (Table 1). Figure 4 shows a dendrogram of representatives of the Verrucomicrobia. Several monophyletic subdivisions are seen, only two of which are represented by the cultivated strains. Clone sequences of this division from soil are predominantly from members of the phylogenetically broad subdivisions 2 and 3. The abundance of these two groups suggests their ecological importance. For instance, the abundance of one representative of Verrucomicrobia subdivision 2 (EA25) was estimated by PCR at 10⁷ to 10⁸ cells per g of a pasture soil sample, 1 to 10% of the total microbial content (25).
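The abundance estimate for EA25 implies a total cell count for the soil sample. Reading the figures as 10⁷ to 10⁸ cells per g and assuming that the ends of the two ranges correspond (10⁷ cells per g with 1%, 10⁸ cells per g with 10%), the arithmetic can be sketched in Python as follows. The function name `implied_total_cells_per_g` is hypothetical, and the pairing of range ends is an assumption, not something stated in the text.

```python
def implied_total_cells_per_g(group_cells_per_g: int, share_percent: int) -> int:
    """Total cells per g implied by a group's abundance and its percent share."""
    if not 0 < share_percent <= 100:
        raise ValueError("share_percent must be in the range 1-100")
    # Integer arithmetic keeps the powers of ten exact.
    return group_cells_per_g * 100 // share_percent


low_end = implied_total_cells_per_g(10**7, 1)    # 10^7 cells/g at 1% of total
high_end = implied_total_cells_per_g(10**8, 10)  # 10^8 cells/g at 10% of total
print(low_end, high_end)
```

Under this pairing, both ends of the range point to the same total, on the order of 10⁹ cells per g of soil.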
In our phylogenetic analyses we consistently find that the division Chlamydia is a specific sister group of the Verrucomicrobia. We find no support for the notion (17, 30, 47) of a specific relatedness of the planctomycetes with the Verrucomicrobia.
The GNS bacteria have been recognized as a division-level bacterial group for over a decade (51). Even today, however, this division is still represented by only a few isolates. The cultured representatives have a wide range of phenotypes, from anoxygenic photosynthesis (Chloroflexus) to thermophilic organotrophy (Thermomicrobium). Figure 5 shows the relatedness groups of GNS bacteria detected in the environment. It is apparent from the dendrogram that all of the cultivated representatives except the chlorinated hydrocarbon-reducing Dehalococcoides ethenogenes (31) are related in subdivision 3, together with several clone sequences from a hot spring, a rice paddy, and activated sludge (data not shown). By contrast, most of the environmental sequences described to date fall into a different relatedness group, subdivision 1, with no cultivated representatives. Considering the wide variety of habitats that have contributed GNS sequences (Fig. 5; Table 1), particularly to GNS subdivision 1, members of this division likely play significant roles in the environment.
Candidate division OP11 is a recently proposed novel bacterial division for which there is no reported cultivated representative (19, 21). However, several independent clonal studies have reported environmental sequences that together form the OP11 clade. Figure 6 shows a dendrogram of the known environmental sequence representatives of the division, with five subdivisions currently identifiable. OP11 sequences all have highly atypical sequence signatures for the domain Bacteria (51), and they have low sequence identities, only about 80%, to sequences outside the OP11 division. This may be due to higher-than-average mutation rates in OP11 rRNAs, as has been suggested for other groups such as the planctomycetes (27). OP11 sequences have been obtained from a variety of habitats including several different types of soil, freshwater sediments, the deep subsurface, and hot springs (Table 1), suggesting that members of the division play significant ecological roles. Until cultivated representatives of the OP11 division are characterized, little beyond the general properties of Bacteria can be inferred about their physiology.
Several additional candidate divisions have been identified based on environmental sequences alone, shown as outlined wedges in Fig. 1. These divisions comprise two or more sequences over 500 nt in length that were obtained mostly from independent studies, or at least from independent PCR events. An expanded view detailing representatives of each candidate division is available at http://crab2.berkeley.edu/pacelab/176.htm. The candidate divisions are identified according to the original source or clone names of the sequences that define the clade. Divisions designated OP were originally identified in an analysis of a Yellowstone hot spring, Obsidian Pool (21). Representatives of three of these divisions, OP5, -8, and -10, also have been encountered in a study of a hydrocarbon-contaminated aquifer at Wurtsmith Air Force Base in Michigan (9). The latter study also identified novel divisions WS1, now identified in a Siberian tundra soil (53), and WS6. Candidate division marine group A was originally identified and named based on partial sequences obtained from marine microbial communities in the Atlantic and Pacific oceans (13) and verified by full-length-sequence representatives of the group from similar marine samples (16). Abundance and depth profiles of marine group A sequences in the water column (16) suggest their global distribution in marine communities; no representatives of this candidate division outside of marine environments have yet been obtained (Table 1). Representatives of the termite group I candidate division originally were identified as a closely related clade of sequences from the termite gut (33) but now also have been identified in a contaminated aquifer (9). Candidate division OS-K was identified in a study of a Yellowstone hot spring, Octopus Spring (49), and bolstered by additional representative sequences from studies of a hydrothermal vent (32) and marine sediment (7).
Candidate divisions TM6 and TM7 are named after sequences obtained in an environmental study of a peat bog (38), and other partial-length-sequence representatives of these candidate divisions were subsequently identified from activated sludges (4, 15) and soil (5).
Phylogenetic trees based on rRNA sequences show that bacterial diversity is represented by natural relatedness groups, the phylogenetic divisions (51). About 36 such divisions are currently identifiable. The final extent of division-level diversity in the bacterial domain is still unknown but clearly will be more than 40 divisions. Culture-independent studies have resulted in multiple hits on the majority of described divisions in different habitat types (Table 1), suggesting that the final number of divisions will be within the same order of magnitude as the present estimate.
The molecular analyses of environmental DNA have revealed substantial phylogenetic diversity with little or no representation among organisms previously studied. Because of their abundance and wide distribution, some of the organisms represented by the sequences likely contribute significantly to the global chemical cycles. Descriptions of newly identified but apparently important bacterial divisions, such as the Acidobacterium division and the Verrucomicrobia, are presently confounded by too few cultivated representatives and only rudimentary descriptions of the strains. Cultivation efforts need to be directed at new representatives of the diverse groups for further study. Continued work to sequence the 16S rDNAs of all deposited type cultures (<50% sequenced to date) may also result in detection of additional cultivated representatives of newly described divisions. It is a challenge to microbial biologists to determine the physiological diversity and environmental roles of these recently articulated divisions of Bacteria.
The phylogenetic differences between the bacterial divisions probably are reflected in substantial physiological differences. Some properties, the general properties of Bacteria, are expected to be distributed among all the divisions. Division-specific novelties are known as well, for instance, endospore formation by the low-G+C gram-positive bacteria or axial filaments (endoflagella) in the spirochetes. Some biochemical properties evidently have transferred laterally among the divisions. For example, the two types of photosynthetic complexes, photosystem I (PSI) and PSII, are each distributed sporadically among the divisions, consistent with lateral transfer (3). Lateral transfer may also have resulted in combinatorial novelty among the divisions; PSI and PSII, for instance, apparently came together in the cyanobacteria to create oxygenic photosynthesis, with profound consequences to the biosphere (3). Many more such division-specific qualities and cooperations should become evident at the molecular level as comparative genomics gives us a sharper phylogenetic picture of bacterial diversity.
We thank John A. Fuerst for providing useful comments on the manuscript and Pascale Durand, Floyd Dewhirst, and Bruce Paster for supplying unreleased sequences. We also thank Michael Tanner and Scott Dawson for assistance in establishing the website of additional information.
The authors’ research is supported by grants to N.R.P. from the National Institutes of Health and Department of Energy.