The complete DNA sequence of the aerobic cellulolytic soil bacterium Cytophaga hutchinsonii, which belongs to the phylum Bacteroidetes, is presented. The genome consists of a single, circular, 4.43-Mb chromosome containing 3,790 open reading frames, 1,986 of which have been assigned a tentative function. Two of the most striking characteristics of C. hutchinsonii are its rapid gliding motility over surfaces and its contact-dependent digestion of crystalline cellulose. The mechanism of C. hutchinsonii motility is not known, but its genome contains homologs for each of the gld genes that are required for gliding of the distantly related bacteroidete Flavobacterium johnsoniae. Cytophaga-Flavobacterium gliding appears to be novel and does not involve well-studied motility organelles such as flagella or type IV pili. Many genes thought to encode proteins involved in cellulose utilization were identified. These include candidate endo-β-1,4-glucanases and β-glucosidases. Surprisingly, obvious homologs of known cellobiohydrolases were not detected. Since such enzymes are needed for efficient cellulose digestion by well-studied cellulolytic bacteria, C. hutchinsonii either has novel cellobiohydrolases or has an unusual method of cellulose utilization. Genes encoding proteins with cohesin domains, which are characteristic of cellulosomes, were absent, but many proteins predicted to be involved in polysaccharide utilization had putative D5 domains, which are thought to be involved in anchoring proteins to the cell surface.
Cellulose is probably the most abundant biopolymer on earth (5). The β-1,4-linked glucose chains of cellulose form highly ordered crystalline fibrils that are insoluble and relatively recalcitrant to degradation. Cellulolytic bacteria produce a suite of enzymes that act synergistically to convert crystalline cellulose to glucose (38, 62). The cellulolytic bacteria that have been studied most use either of two strategies to digest cellulose. Some, such as aerobic actinomycetes belonging to the genera Streptomyces and Thermobifida, secrete soluble extracellular cellulolytic enzymes that usually contain carbohydrate binding modules (CBMs) that anchor them to the substrate (71). These enzymes attack the insoluble substrate and release glucose, cellobiose, and short oligomers that are taken up and metabolized by the cells. In contrast, anaerobic bacteria of the genera Clostridium and Ruminococcus produce multiprotein complexes, called cellulosomes, that often remain attached to cells (7, 9, 19, 50, 62). Cellulosomes contain cellulose-binding proteins, enzymes involved in the hydrolysis of cellulose and of other polysaccharides, and scaffolding proteins that hold the multiprotein complex together. Cellulosomes often anchor the bacteria to their substrate and result in localized release of sugars that are taken up by the cells.
Cytophaga hutchinsonii (Fig. 1), the type species of the genus Cytophaga, is an abundant aerobic cellulolytic soil bacterium (37, 55, 64, 72). C. hutchinsonii is a gram-negative bacterium that belongs to the phylum Bacteroidetes (also known as the Cytophaga-Flavobacterium-Bacteroides group). C. hutchinsonii utilizes very few substrates as sole carbon and energy sources. Besides cellulose, its only known substrates are cellobiose and glucose (37). Wild strains generally use these soluble sugars poorly and exhibit a preference for crystalline cellulose. C. hutchinsonii requires direct contact with cellulose for efficient digestion, and most of the cellulolytic enzymes appear to be cell associated (16, 47, 64). Another unusual feature of cellulose degradation by C. hutchinsonii is that reducing sugars such as glucose and cellobiose do not accumulate in the medium when it digests cellulose (64). This is probably not simply the result of efficient uptake and metabolism of these sugars, since incubation of cells under anaerobic conditions, which should interfere with these processes, does not result in accumulation of reducing sugars.
C. hutchinsonii was selected for genome sequencing for several reasons. First, biochemical and physiological studies suggested that it might use a novel strategy for cellulose utilization. Second, techniques to genetically manipulate C. hutchinsonii are available (44). Most cellulolytic bacteria have resisted genetic analysis, so the development of these techniques promises new insights into bacterial cellulose utilization. Third, C. hutchinsonii exhibits a form of rapid gliding motility whose mechanism remains unexplained despite decades of study (43, 64, 72). Gliding may help to facilitate cellulose digestion, since gliding cells align themselves with and move along cellulose fibers as they digest them (64, 72). Finally, few genome sequences have been reported for members of the phylum Bacteroidetes, and none are closely related to C. hutchinsonii. This paper highlights novel features of the C. hutchinsonii genome, with particular emphasis on genes and proteins likely to be involved in cellulose utilization and gliding motility.
The random shotgun method was used to sequence the genome of C. hutchinsonii ATCC 33406. Large (40 kb)-, medium (8 kb)-, and small (3 kb)-insert random libraries were partially sequenced, with an average success rate of 90% and an average high-quality read length of 634 nucleotides. Sequences were assembled with parallel phrap (High Performance Software, LLC). Possible misassemblies were corrected with Dupfinisher (C. F. Han and P. Chain, presented at the 2006 International Conference on Bioinformatics and Computational Biology, Las Vegas, NV, 26 to 29 June 2006) or by analysis of transposon insertions in bridge clones. Gaps between contigs were closed by editing, custom primer walking, or PCR amplification. The completed genome sequence of C. hutchinsonii contains 105,896 reads, achieving an average of 15-fold sequence coverage per base, with an error rate of <1 in 100,000.
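The reported depth can be checked with simple arithmetic: total sequenced bases divided by genome length. The sketch below is only a consistency check, not the authors' calculation; it assumes that every one of the 105,896 reads contributes the average high-quality length of 634 nucleotides, and it takes the chromosome length (4,433,218 bp) from the genome description in this paper. The function name `fold_coverage` is illustrative.

```python
# Rough consistency check of the reported sequencing depth.
# Assumption: every read contributes the average high-quality read length.
reads = 105_896           # reads in the completed genome sequence (from the text)
mean_read_length = 634    # average high-quality read length, nucleotides
genome_size = 4_433_218   # chromosome length in bp (from the genome description)


def fold_coverage(n_reads: int, read_len: float, genome_len: int) -> float:
    """Average depth = total sequenced bases / genome length."""
    return n_reads * read_len / genome_len


coverage = fold_coverage(reads, mean_read_length, genome_size)
print(f"{coverage:.1f}x")  # prints "15.1x"
```

Under these assumptions the estimate is about 15-fold, in line with the coverage stated above.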
Gene predictions were obtained using Glimmer, and tRNAs were identified using tRNAScan-SE. Basic analyses of the gene predictions were performed by comparing coding sequences against the Pfam, BLOCKS, and Prodom databases. Protein localizations were predicted with PSORTb (21), and lipoproteins were identified using LipoP (33). A team of annotators added gene definitions and functional classes, using BLAST results and information from the Pfam (63; http://pfam.janelia.org/index.html), BLOCKS (26), Prodom (59), and SMART (58) databases. Metabolic pathways were constructed using MetaCyc as a reference data set (14). Genes encoding candidate glycoside hydrolases were detected with routines used for updates of the Carbohydrate Active Enzymes (CAZY) database (18) at http://www.cazy.org/CAZY/. Detailed information about the genome properties and genome annotation can be obtained from the JGI Integrated Microbial Genomes website (40) at http://img.jgi.doe.gov/pub/main.cgi.
To obtain a list of orthologs from bacteroidete genomes, a perl script that determines bidirectional best hits was written. Genes g and h are considered orthologs if h is the best BLASTP hit for g and vice versa, with E values of ≤10⁻¹⁵. A gene is considered strain specific if it has no hits with an E value of 10⁻⁵ or less.
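The bidirectional-best-hit criterion can be illustrated with a short sketch. This is not the perl script used in the study; it is a minimal Python rendering that assumes BLASTP results have already been parsed into (query, subject, E value) tuples. The function names and the toy gene identifiers (a1, b1, and so on) are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query gene, subject gene, E value)


def best_hits(hits: Iterable[Hit], max_evalue: float) -> Dict[str, str]:
    """For each query, keep the subject with the lowest E value at or below the cutoff."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit],
                            max_evalue: float = 1e-15) -> List[Tuple[str, str]]:
    """Pairs (g, h) where h is the best hit for g and g is the best hit for h."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    return sorted((g, h) for g, h in forward.items() if reverse.get(h) == g)


def strain_specific(genes: Iterable[str], hits: Iterable[Hit],
                    max_evalue: float = 1e-5) -> List[str]:
    """Genes with no hit at or below the (looser) strain-specificity cutoff."""
    with_hit = {query for query, _, evalue in hits if evalue <= max_evalue}
    return sorted(set(genes) - with_hit)


# Toy data with hypothetical gene names.
a_vs_b = [("a1", "b1", 1e-50), ("a1", "b2", 1e-20),
          ("a2", "b2", 1e-10), ("a3", "b3", 1e-30)]
b_vs_a = [("b1", "a1", 1e-48), ("b2", "a1", 1e-18), ("b3", "a3", 1e-4)]

orthologs = bidirectional_best_hits(a_vs_b, b_vs_a)
unique = strain_specific(["a1", "a2", "a3", "a4"], a_vs_b)
print(orthologs)  # [('a1', 'b1')]
print(unique)     # ['a4']
```

In the toy data, a3 and b3 fail the ortholog test because the reverse hit does not meet the 10⁻¹⁵ cutoff, while a2 is not strain specific because its weak hit still passes the 10⁻⁵ cutoff.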
Prealigned 16S rRNA sequences were derived from the Ribosomal Database Project site (17; http://rdp.cme.msu.edu/index.jsp). Manual alignment adjustments were made as needed, with the assistance of the BioEdit multiple alignment tool of Hall (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). The refined multiple alignment was used as input for the generation of a phylogenetic tree, using the program package PHYLIP (20). The fastDNAml (52) and nucleic acid sequence maximum likelihood (51) methods with default settings were employed to obtain character-based trees. Both the fastDNAml and maximum likelihood trees yielded similar clusters and arrangements of taxa within them.
Cells for scanning electron microscopy were grown on Whatman filter paper on DMA agar as previously described (44). Cells were fixed with 2.5% glutaraldehyde, 0.1% OsO4 in 10 mM HEPES buffer, pH 7.2, for 30 min at 25°C. Fixed cells were washed twice in 10 mM HEPES buffer, postfixed in 1% OsO4, and washed twice with distilled water. Cells were dehydrated with ethanol and critical point dried with CO2. Samples were mounted, sputter coated with 60% gold-40% palladium, and viewed with a Hitachi S-570 scanning electron microscope.
The sequences of C. hutchinsonii can be accessed using GenBank accession number CP000383.
The genome of C. hutchinsonii consists of a single, circular, 4,433,218-bp chromosome that has a GC content of 38.85% (see Fig. S1 in the supplemental material). A total of 3,790 protein-encoding genes are predicted, 1,986 of which have been assigned a tentative function. The GC skew allowed prediction of the site of the origin of replication to near nucleotide position 4,038,464. Surprisingly, ribosomal genes, tRNA genes, and typical replication-related genes, such as dnaA, dnaN, recF, and gyrA, which are often located near the origin of replication in bacteria, were not located near this site. Three rRNA operons were identified in the genome. Analysis of the 16S rRNA sequences confirmed that C. hutchinsonii belongs to the phylum Bacteroidetes (Fig. 2). It is a member of the class Sphingobacteria and is only distantly related to well-studied bacteroidetes, such as members of the genera Bacteroides, Porphyromonas, Prevotella (all from class Bacteroidetes), and Flavobacterium (class Flavobacteria).
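GC content and GC skew are simple sequence statistics, and a sketch may make the origin prediction more concrete. The text does not state how the skew was computed, so the code below assumes one common convention: skew is (G − C)/(G + C) per window, and the minimum of the cumulative skew is taken as the approximate origin, which presumes a G-enriched leading strand. The function names, the window size, and the toy sequence are illustrative only.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cumulative_gc_skew(seq: str, window: int = 1000) -> List[float]:
    """Running sum of (G - C) / (G + C) over non-overlapping windows."""
    seq = seq.upper()
    total = 0.0
    profile = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        if g + c:
            total += (g - c) / (g + c)
        profile.append(total)
    return profile


def predicted_origin(seq: str, window: int = 1000) -> int:
    """End coordinate of the window where the cumulative skew is lowest.

    Assumption: the cumulative-skew minimum marks the origin; this is a
    common heuristic, not necessarily the method used for this genome.
    """
    profile = cumulative_gc_skew(seq, window)
    lowest = min(range(len(profile)), key=profile.__getitem__)
    return min((lowest + 1) * window, len(seq))


# Toy sequence: a C-rich stretch followed by a G-rich stretch.
toy = "C" * 40 + "G" * 40
print(gc_content("ATGC"))                # 0.5
print(predicted_origin(toy, window=10))  # 40
```

For a real chromosome the sequence would be read from the GenBank record, and a circular genome would normally be scanned with the start point rotated so that the minimum does not fall at the arbitrary coordinate origin.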
One of the most distinctive features of C. hutchinsonii is its ability to rapidly digest crystalline cellulose. Cellulose utilization by most well-studied microorganisms involves the activities of multiple enzymes, including endo-β-1,4-glucanases, cellobiohydrolases (also called exo-β-1,4-glucanases), and β-glucosidases, that act synergistically to convert crystalline cellulose to glucose (39, 70) (Fig. 3). Endo-β-1,4-glucanases attack the cellulose chain at exposed internal glycosidic bonds. They nick the long polymeric substrate and provide sites of attack for cellobiohydrolases, which release cellobiose from the exposed ends. Members of one class of cellobiohydrolases attack the nonreducing end of the polymer, whereas members of the second class attack the reducing end (4). β-Glucosidases hydrolyze cellobiose and short cellodextrin oligomers to glucose. Analysis of the amino acid sequences of cellulolytic enzymes allows them to be grouped into a limited number of glycoside hydrolase families (27-29). Examination of the C. hutchinsonii genome identified many genes encoding possible cellulolytic enzymes. These include possible endo-β-1,4-glucanases belonging to glycoside hydrolase families GH5 and GH9 and β-glucosidases belonging to GH3 (Table 1). Genes encoding proteins with similarity to known cellobiohydrolases, which are typically members of family GH6 or GH48 in bacteria (23, 71), were not detected. Since cellobiohydrolases are needed for efficient cellulose utilization by other organisms, C. hutchinsonii either has novel cellobiohydrolases or uses an unusual method of cellulose utilization that does not require such enzymes. Some GH9 endo-β-1,4-glucanases are processive (22). The presence of processive endoglucanases could account for the ability of C. hutchinsonii to digest crystalline cellulose in the apparent absence of cellobiohydrolases. However, none of the C. hutchinsonii enzymes display high levels of similarity to experimentally characterized processive endoglucanases, so further experimental evidence is needed to evaluate this possibility. The apparent absence of cellobiohydrolases may partially explain why reducing sugars do not accumulate in the medium when cells of C. hutchinsonii digest cellulose (64). All four of the candidate β-glucosidases are predicted to reside in the periplasm, and three of them are predicted lipoproteins (Table 1). The periplasmic location of these enzymes may also limit the amount of reducing sugars released beyond the confines of the cell. In addition to enzymes involved in cellulose utilization, C. hutchinsonii has genes predicted to encode glycohydrolases involved in the digestion of hemicelluloses, such as xylan (Table 2), and genes encoding β-glycosidases of uncertain specificity (Table 3). The remaining predicted glycohydrolases (Table 4) are likely to be involved in peptidoglycan turnover and utilization of intracellular storage compounds. Genes encoding predicted pectate lyases (Table 5) and genes encoding polysaccharide esterases, some of which are predicted to be involved in xylan utilization (Table 6), were also identified. C. hutchinsonii does not grow on xylan or pectin as a sole carbon and energy source but may use xylanases, pectate lyases, and polysaccharide esterases to gain access to cellulose in plant material. A similar situation has been observed for the cellulolytic anaerobes Clostridium thermocellum and Clostridium cellulovorans (6).
Recognizable CBMs are found on many of the glycohydrolases (Tables 2 and 3), but surprisingly, none of the proteins related to endo-β-1,4-glucanases contain such a module (Table 1). The predicted endo-β-1,4-glucanases may have novel CBMs or may interact with other CBM-containing proteins to form multiprotein complexes that function in cellulose digestion. Equally surprising is that no genes encoding proteins with type A CBMs, the only ones known to bind to crystalline cellulose (11), were detected in the entire genome. Cells of C. hutchinsonii do bind to crystalline cellulose and actively glide along the fibers (Fig. 1). Presumably, components of the motility machinery are involved in these interactions. The attachment of cells and movement along the cellulose fibers may make dedicated CBM domains on the cellulolytic enzymes unnecessary. If type A CBMs were present, they might even prevent gliding and thus decrease the ability of the cells to access and digest cellulose.
Many of the C. hutchinsonii proteins with possible roles in polysaccharide utilization are large and have complex multidomain structures (Tables 1, 2, 3, 5, and 7). In addition to the glycohydrolase and CBM domains discussed above, other domains present on some of the predicted proteins include domains with immunoglobulin-like folds, including group 2 bacterial immunoglobulin-like domains (BIG_2; Pfam number PF02368), N-terminal immunoglobulin-like domains of cellulase (CelD_N; Pfam number PF02927), immunoglobulin I-set domains (IG; Pfam number PF00047), fibronectin type 3-like domains (FN3; Pfam number PF00041), and polycystic kidney disease protein-like domains (PKD; Pfam number PF00801). Parallel β helix repeat domains (PbH1; SMART database number SM00710) and novel conserved domains (X1, X3, X100, and Y98) were also identified. The functions of these domains are not known, but possibilities include binding to cellulose, binding to other components of the cellulolytic machinery, or interaction with the cell surface. Eighteen of the proteins listed in Tables 1, 2, 3, 5, 6, and 7 have carboxy-terminal domains that are similar to the D5 domains of Rhodothermus marinus cell-associated xylanolytic enzymes (34). The R. marinus D5 domains have been postulated to be involved in attachment to the cell surface. Carboxy-terminal D5 domains are also found on 39 additional C. hutchinsonii proteins (Fig. 4). These may also be part of the cell-bound polysaccharide utilization machinery of C. hutchinsonii or may have other functions. Experimental evidence indicates that C. hutchinsonii requires direct contact with cellulose for efficient digestion and that its cellulolytic enzymes are primarily cell associated (47, 64), and some of the domains described above may be involved in this localization. Cellulolytic members of the gram-positive genera Clostridium and Ruminococcus also produce cell-associated cellulases. These bacteria have multiprotein cellulolytic complexes, called cellulosomes, composed of scaffoldin proteins with multiple cohesin domains that bind to complementary dockerin domains on the cellulolytic enzymes (6). The high-affinity cohesin-dockerin interactions result in the formation of the large cell surface cellulosomal structures. C. hutchinsonii does not encode obvious scaffoldin proteins or proteins with recognizable cohesin or dockerin domains but may have cellulosome-like structures composed of different cellulolytic multienzyme complexes.
Other predicted cell-associated proteins that may be involved in cellulose utilization include homologs of the Bacteroides thetaiotaomicron outer membrane starch utilization proteins SusC and SusD. B. thetaiotaomicron is an anaerobic inhabitant of the human large intestine and is a distant relative of C. hutchinsonii. It does not digest cellulose but does utilize starch. Like those of C. hutchinsonii, the B. thetaiotaomicron polysaccharide-degrading enzymes are primarily cell associated (2, 57). Several genes that are required for starch utilization have been identified, and the encoded proteins have been characterized (54, 57, 60, 61). SusC and SusD are involved in binding starch on the cell surface and in transport of oligomers across the outer membrane. Homologs of SusC and SusD are common in the phylum Bacteroidetes and include the cell surface proteins RagA and RagB, respectively, of Porphyromonas gingivalis (24). Analysis of the C. hutchinsonii genome revealed the presence of two genes related to susC (CHU_0546 and CHU_0553) and two genes related to susD (CHU_0547 and CHU_0554). Since C. hutchinsonii does not utilize extracellular starch and the only polysaccharide that it is known to use is cellulose, we speculate that the proteins encoded by these genes may be involved in binding and utilization of cellulose or products of cellulose digestion.
C. hutchinsonii has 184 genes predicted to be involved in transport. Of particular interest are genes potentially involved in transport of the products of cellulose hydrolysis. Transporters of the phosphoenolpyruvate:sugar phosphotransferase system are absent from C. hutchinsonii, but six genes encoding putative sugar transport permeases of the major facilitator superfamily are present (CHU_0773, CHU_0325, CHU_0960, CHU_1068, CHU_1656, and CHU_3606). The predicted transport proteins are members of the fucose:H+ symporter family, which has well-characterized members that transport fucose, glucose, and galactose, among other sugars. These proteins may be involved in transport of the products released during cellulose digestion. Another transporter that may play a role in cellulose utilization is the putative gluconate transporter encoded by CHU_1403. The remaining small-molecule transporters are predicted to function in transport of amino acids and inorganic ions across the cytoplasmic membrane and in cell envelope biogenesis. Genes encoding ExbB, ExbD, 4 TonB-like proteins, 15 TonB-dependent outer membrane receptors, and a TolC-like protein were also identified and may play roles in transport across the outer membrane.
Translocation of proteins across the cytoplasmic membrane is apparently mediated by SecA, SecE, SecY, SecDF, SecG, YidC, and YajC of the Sec system and by TatA and TatC of the twin-arginine transport system. The major components involved in type II secretion (GspD, GspE, GspF, and GspG) are also present and, presumably, mediate protein translocation across the outer membrane. Possible ABC transporters that may transport specific proteins (type I transport) and autotransporters that may facilitate transport of themselves or other proteins across the outer membrane (type V transport) were also identified. The C. hutchinsonii genome encodes 194 predicted lipoproteins, and genes encoding machinery for their processing (lgt, lspA, and lnt) and translocation (lolA, lolD, and lolE) are also present.
C. hutchinsonii carries out aerobic respiration of glucose to CO2 (47, 64), and genome analysis indicated the presence of a complete Embden-Meyerhof-Parnas pathway and tricarboxylic acid cycle. Genes encoding each of the NADH dehydrogenase subunits, cytochrome c, cytochrome c oxidase, and components of ATP synthase were also present, accounting for the ability of C. hutchinsonii to carry out aerobic respiration of glucose. C. hutchinsonii also may have the ability to import and metabolize gluconate. In addition to the apparent gluconate transporter mentioned above, the protein encoded by CHU_1404 is predicted to exhibit gluconokinase and 6-phosphogluconate dehydrogenase activities and is likely to be involved in gluconate metabolism (see Fig. S2 in the supplemental material). As indicated above, C. hutchinsonii appears to carry enzymes involved in digestion of xylan but does not use xylan as a sole source of carbon and energy. C. hutchinsonii also fails to grow on common components of xylan, such as xylose and arabinose (37, 64), and analysis of the genome revealed the apparent absence of transporters for these sugars and of key enzymes needed for their metabolism, including arabinose isomerase, xylose isomerase, ribulokinase, and xylulokinase. Transporters and enzymes involved in the utilization of galacturonic acid were also apparently lacking, explaining the inability to grow on pectin despite the presence of genes encoding candidate pectate lyases.
C. hutchinsonii grows on minimal media with filter paper cellulose or glucose as the sole carbon and energy source and thus has the ability to synthesize all of its organic components from these substrates. As expected, analysis of the genome sequence revealed genes encoding biosynthetic enzymes needed to synthesize the 20 common amino acids and to synthesize nucleotides, fatty acids, and vitamins and coenzymes, such as biotin, folic acid, lipoic acid, nicotinic acid, pantothenic acid, pyridoxine, riboflavin, and thiamine. Genes encoding enzymes for the synthesis of heme and menaquinone were also identified.
C. hutchinsonii carries a variety of proteins that are predicted to regulate gene expression in response to external or internal stimuli. Sigma factors include an RpoD (σ70) homolog, a σ54 homolog, and 19 sigma factors belonging to the extracytoplasmic function (ECF) subfamily. C. hutchinsonii RpoD is similar in size (32.7 kDa) and sequence to other bacteroidete RpoD proteins and is much smaller than Escherichia coli σ70. Like the other bacteroidete RpoD proteins, C. hutchinsonii RpoD lacks regions found in most nonbacteroidete RpoD proteins, such as the N-terminal region 1.1 and the segment between regions 1.2 and 2.1 (68). The novel structure of bacteroidete RpoD sigma factors may account for some of the unusual features of bacteroidete promoters that have been studied (8). The presence of large numbers of ECF sigma factors is a common property of members of the phylum Bacteroidetes (35). ECF sigma factors in other bacteria regulate the expression of genes with extracytoplasmic function. C. hutchinsonii has 53 predicted histidine kinase two-component signal transduction proteins, 34 predicted response regulatory proteins, and 7 proteins that contain both histidine kinase and response regulatory domains. C. hutchinsonii is predicted to carry 17 cyclic nucleotide-binding proteins and probably uses cyclic AMP to regulate gene expression and enzymatic activities. In contrast, it does not appear to use the common bacterial signaling molecule cyclic di-GMP. Most bacteria carry proteins with GGDEF domains, involved in the synthesis of cyclic di-GMP, and proteins with EAL domains, thought to function as phosphodiesterases for cyclic di-GMP turnover (56). Analysis of the C. hutchinsonii genome indicated the complete absence of genes encoding proteins with these domains. GGDEF and EAL domain proteins were also absent from the other members of the phylum Bacteroidetes with complete genome sequence data that we analyzed, including B. thetaiotaomicron, Bacteroides fragilis YCH46, Porphyromonas gingivalis, Prevotella intermedia, and Tannerella forsythensis.
C. hutchinsonii cells cannot swim in liquid, but they attach to and move along cellulose fibers as they digest them (64) (Fig. 1). The mechanism of this gliding motility is not known. Gliding cells of C. hutchinsonii move at speeds of up to 5 μm/second over glass surfaces and occasionally reverse their direction of movement. Cells also attach to the glass by one pole and rotate in place at frequencies of about 2 revolutions per second. The ability to glide over surfaces may provide C. hutchinsonii cells with a selective advantage for utilization of insoluble cellulose. Gliding motility may allow cells to migrate along cellulose fibers in search of regions that are most amenable to enzymatic attack and to penetrate deep into the cellulose matrix as the substrate is digested. In some other bacteria, flagella and type IV pili allow cells to move over surfaces (25, 41). Electron microscopic analyses have failed to identify these organelles on cells of C. hutchinsonii, and analysis of the genome also failed to identify genes encoding critical components of flagella and type IV pili, suggesting that C. hutchinsonii gliding motility relies on other machinery.
Gliding motility is characteristic of many members of the phylum Bacteroidetes (42). The mechanism of gliding motility has been studied extensively for one bacteroidete, Flavobacterium johnsoniae, which is distantly related to C. hutchinsonii (Fig. 2). Sixteen genes (gldA, gldB, gldD, gldF, gldG, gldH, gldI, gldJ, gldK, gldL, gldM, gldN, sprA, sprB, sprC, and sprD) involved in F. johnsoniae gliding motility have been identified (1, 12, 13, 30-32, 45, 46). gldA, gldF, and gldG are thought to encode components of an ATP-binding cassette transporter that is required for gliding (1, 30). The functions of the other motility proteins are less certain. In a current model of F. johnsoniae gliding, the Gld proteins comprise the gliding motor, which propels cell surface adhesins comprised of Spr proteins. C. hutchinsonii has homologs of each of the gld and spr genes, and these are likely to be involved in cell movement. Disruption of C. hutchinsonii gldG by homologous recombination with a plasmid containing an internal fragment of the gene resulted in a loss of motility, implicating C. hutchinsonii gldG in gliding (C. Guntur and M. J. McBride, unpublished data).
Gliding motility is not confined to the bacteroidetes, and recent evidence indicates that there are multiple, genetically unrelated mechanisms for bacterial gliding (42, 43). Myxococcus xanthus has two gliding motility systems. The S motility system relies on extension and retraction of type IV pili to pull cells, whereas the mechanism of the A motility system is not yet certain (48, 65, 69, 73). Mycoplasma mobile appears to use a third gliding motility system composed of cytoskeletal proteins and large cell surface proteins (49). Most F. johnsoniae and C. hutchinsonii gld genes have no obvious homologs in the genomes of M. xanthus and M. mobile (12), suggesting that Cytophaga-Flavobacterium gliding motility is distinct from myxobacterial or mycoplasma gliding.
F. johnsoniae does not digest cellulose, but it does digest another insoluble polysaccharide, chitin. Surprisingly, all mutations that eliminate gliding also eliminate the ability of F. johnsoniae to digest chitin (12, 13, 15, 45, 46). The connection between motility and chitin digestion is not understood. Perhaps cells glide along the insoluble polymer until they reach regions amenable to digestion. Alternatively, the motility machinery, which can move large particles over the cell surface (53), may be needed to move oligomers to sites where they can be digested further or taken into the cell. The role, if any, of motility in C. hutchinsonii cellulose digestion remains to be explored. The regular arrangement of gliding cells on cellulose fibers (Fig. 1) suggests that motility may position the cell-associated cellulolytic machinery so that digestion of the insoluble polymer is optimized.
All motile bacteria that have been studied control their motility apparatus to move in a favorable direction. This typically involves a complex signal transduction system, such as the E. coli chemotaxis system, that senses the environment and affects the functioning of the motility apparatus. C. hutchinsonii presumably controls its motility machinery in response to external stimuli, and it has genes encoding proteins that are similar to E. coli CheY (CHU_2082), CheB (CHU_2078), and CheR (CHU_2079). These genes are clustered together on the genome, supporting the idea that their products may function together. C. hutchinsonii CheB is unusual in that it lacks the N-terminal response regulatory domain that is found in most CheB proteins. Homologs of cheB and cheR are also found in other bacteroidetes that exhibit gliding motility, such as F. johnsoniae and Tenacibaculum sp. strain MED152, but are not found in the nonmotile bacteroidetes B. thetaiotaomicron VPI-5482, B. fragilis YCH46 or NCTC9343, P. gingivalis W83, P. intermedia 17, and T. forsythensis. C. hutchinsonii has another gene, CHU_1237, that encodes a protein with similarity to both CheB and CheR. This protein has a CheB-like domain at the amino terminus followed by a CheR-like domain, a PAS domain, and a histidine kinase domain at the carboxy terminus. The CheB-like domain lacks the response regulator domain that is present in most CheB proteins. Genes encoding other expected components of a bacterial chemotaxis system, such as homologs of methyl-accepting chemotaxis proteins (MCPs) and of CheA, CheW, and CheZ, are apparently lacking. The absence of obvious MCPs is surprising since CheB (methylesterase) and CheR (methyltransferase) are expected to modify MCPs. The C. hutchinsonii CheB and CheR homologs may modify novel chemotaxis proteins or may have roles unrelated to motility and chemotaxis. The region surrounding C. hutchinsonii cheB, cheR, and cheY contains several genes whose products are likely to be involved in signal transduction. These include an apparent histidine kinase with a PAS domain (CHU_2084), two proteins that each contain histidine kinase and response regulatory domains (CHU_2077 and CHU_2081), and a large protein (CHU_2080) that contains a CHASE domain, a GAF domain, a histidine kinase domain, and a response regulatory domain. PAS, CHASE, and GAF domains are commonly found in sensory or signal transduction proteins (3, 66, 74) and could have roles in C. hutchinsonii tactic responses.
The complete genome sequences of five members of the phylum Bacteroidetes, namely, C. hutchinsonii, B. thetaiotaomicron VPI-5482, B. fragilis, P. gingivalis W83, and P. intermedia 17, were analyzed. B. thetaiotaomicron, B. fragilis, P. gingivalis, and P. intermedia are nonmotile anaerobic members of the class Bacteroidetes, and each typically inhabits animal digestive tracts. C. hutchinsonii is an aerobic bacterium of the class Sphingobacteria that is commonly found in soil. Given these differences in phylogeny and lifestyle, it is not surprising that there are many features of the C. hutchinsonii genome that set it apart from the others. Direct comparison of the complete nucleotide sequences, using NUCmer and dotplot (36) analyses, revealed no recognizable synteny between C. hutchinsonii and the other bacteroidetes. Five hundred fifty-five core bacteroidete genes (requiring bidirectional best hits as a minimum criterion) were shared among the five organisms tested, i.e., C. hutchinsonii, B. thetaiotaomicron, P. gingivalis, P. intermedia, and T. forsythensis. C. hutchinsonii, unlike the other bacteria, appears to have a complete tricarboxylic acid cycle and a complete respiratory electron transport chain, as expected from its demonstrated ability to carry out aerobic respiration. Other differences are noted in Table S1 and Fig. S2 in the supplemental material.
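One way to picture the core gene count is as a set intersection: a reference gene belongs to the core only if it has a bidirectional best hit in every other genome compared. The sketch below assumes that the ortholog pairs have already been computed and reduced, per genome, to the set of reference genes with an ortholog there. The function name, genome labels, and gene identifiers are hypothetical, and the authors' actual procedure may have applied further criteria.

```python
from typing import Dict, Set


def core_genes(orthologs_by_genome: Dict[str, Set[str]]) -> Set[str]:
    """Reference genes that have an ortholog in every compared genome."""
    gene_sets = list(orthologs_by_genome.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical input: for each compared genome, the reference genes that
# have a bidirectional best hit in that genome.
orthologs_by_genome = {
    "genome_B": {"g1", "g2", "g3"},
    "genome_C": {"g2", "g3", "g4"},
    "genome_D": {"g2", "g3"},
}

core = core_genes(orthologs_by_genome)
print(sorted(core))  # ['g2', 'g3']
```

Anchoring the comparison on one reference genome keeps the logic simple, although a stricter definition could require that the orthologs also be mutual best hits among all pairs of the non-reference genomes.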
The complete genome sequence of C. hutchinsonii provides a window on the inner workings of this common cellulolytic soil bacterium. The mechanism of cellulose utilization employed by cells of C. hutchinsonii appears to differ from that employed by cellulosome-containing anaerobic clostridia (7, 9) or by aerobic cellulolytic bacteria, such as Thermobifida fusca and Saccharophagus degradans, which employ soluble cellulases (67, 71). The apparent absence of cellobiohydrolases and type A CBMs also sets C. hutchinsonii apart from most well-studied cellulolytic microorganisms. We hypothesize that C. hutchinsonii uses cell surface endoglucanases to attack the insoluble cellulose fibers. Soluble oligomers may be transported actively across the outer membrane with the assistance of SusC-like and SusD-like proteins, with further digestion occurring in the periplasm and cytoplasm. An efficient surface motility system allows the bacterium to attach to and migrate along its insoluble substrate as it digests it. The genome sequence data and the availability of techniques to genetically manipulate C. hutchinsonii will allow experiments to probe the cellulolytic and motility systems of this common but poorly understood bacterium.
This work was supported by the U.S. Department of Energy under contract no. W-7405-ENG-36 and by a UWM-RGI award to M.M.
We thank M. Burmeister and H. Owen for assistance with scanning electron microscopy and E. Leadbetter for constructive comments on the manuscript.
Published ahead of print on 30 March 2007.
†Supplemental material for this article may be found at http://aem.asm.org/.