The human genome encodes fewer than 20 enzymes for the digestion of complex carbohydrates, mostly plant reserve carbohydrates (sucrose and starch) and lactose. The cell walls of plants, however, represent an enormous nutrient source, albeit one that varies widely in amount, diversity and botanical origin. These carbohydrates are chemically and structurally highly complex, and are arranged in a three-dimensional network that has evolved to be intrinsically resistant to enzymatic breakdown. High molecular weight crystalline cellulose microfibrils are intertwined with hemicelluloses and pectins, which comprise a whole range of homo- and heteropolysaccharides composed of dozens of different monosaccharide units linked in a multitude of ways. Ester substituents or non-carbohydrate polymers, such as lignin, proteins, cutin and suberin, add a further layer of complexity. As a result, a single vegetable contains hundreds of different bonds that need to be cleaved in order to unlock the assimilable carbon of the cell wall constituents. Given the considerable variation in composition and microscopic structure among the cell walls of the vegetables and fruits in the human diet, digestive enzymes face a huge number of different substrates. Since these enzymes are absent from the human genome, humans rely on the microbiota inhabiting the digestive tract to utilize these complex plant polysaccharides. The microbiota must adapt rapidly to environmental cues to determine which enzymes are necessary to metabolize the plant cell wall structures in each meal.
Digestion starts in the oral cavity, where this study suggests that microbes carry a larger range of enzymes for initiating plant polysaccharide breakdown than previously appreciated, as indicated by the presence of cellulases (GH6), hemicellulases (GH26) and pectin hydrolases (GH28 and GH43). Microbes in the oral cavity also initiate the processing of ‘easy’ plant carbohydrates, such as sucrose and starch, which can be converted into biofilm polysaccharides (dextran, fructans) that secure long-term residence for the bacteria in the oral cavity. Compared with the oral sites, starch and glycogen utilization appears much reduced in stool, suggesting specialization in digestion, whereby most of these sugars are likely degraded by human salivary amylase and the oral flora and taken up in the small intestine by the host. The starch molecules that do reach the distal gut are particularly difficult to hydrolyze; they are known as “resistant” starch and are considered a component of dietary fiber.
Upon examination of the mechanisms important for plant cell-wall degradation, we found that cellobiohydrolases acting on the non-reducing end (CAZy family GH6) appear specific to the oral cavity, while cellobiohydrolases acting on the reducing end (family GH48) are specific to the gut. The two types of cellobiohydrolases have been found to digest cellulose synergistically when acting together or with endoglucanases. The current paradigm is that cellobiohydrolases are the essential cellulases because they can release soluble cellobiose from polymeric cellulose in a single step, and that the role of the endoglucanases is to provide chain ends for the cellobiohydrolases. The GH6 genes found in the oral cavity were highly prevalent in supragingival plaque samples and are primarily attributed to the genus Capnocytophaga. The GH48 genes, found in stool, mostly belong to unknown taxa, although likely to some species of the Firmicutes. However, there is no reference genome in this genus with annotated GH48 genes. Thus far, the GH6 family of enzymes has not been identified in any animal gut sample, suggesting that this enzyme specificity is perhaps driven by environmental factors.
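The notion of a family being "specific" to a body site can be illustrated with a minimal sketch. It assumes a table of per-sample CAZy family gene counts; the records, sample names and the helper functions `families_by_site` and `site_specific` are hypothetical and are not the pipeline or the data used in this study. A family is called site-specific here simply when it is detected at one site and at no other.

```python
from collections import defaultdict

# Hypothetical toy input: (sample_id, body_site, CAZy family, gene count).
# These values are illustrative only and are not data from the study.
records = [
    ("s1", "oral", "GH6", 4),
    ("s1", "oral", "GH13", 30),
    ("s2", "oral", "GH6", 2),
    ("s3", "gut", "GH48", 3),
    ("s3", "gut", "GH13", 12),
    ("s4", "gut", "GH48", 1),
    ("s4", "gut", "GH28", 7),
]


def families_by_site(rows):
    """Map each body site to the set of families detected there (count > 0)."""
    sites = defaultdict(set)
    for _sample, site, family, count in rows:
        if count > 0:
            sites[site].add(family)
    return dict(sites)


def site_specific(rows, site):
    """Return the families detected at `site` and at no other site."""
    by_site = families_by_site(rows)
    others = set().union(*(fams for s, fams in by_site.items() if s != site))
    return by_site.get(site, set()) - others


print(sorted(site_specific(records, "oral")))
print(sorted(site_specific(records, "gut")))
```

A presence/absence criterion of this kind is sensitive to sequencing depth, so a real analysis would likely add a prevalence or abundance threshold before calling a family site-specific.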
Based on this study, the gut appears highly specialized for the digestion of complex carbohydrates. Since the other body sites are unlikely to be exposed to plant carbohydrates in significant amounts or for significant lengths of time, plant carbohydrate utilization is likely the most prominent factor explaining the great divide observed in the WGS metagenomic data between the digestive tract and the other body sites. In the gut, the proportion of genes encoding enzymes that hydrolyze plant cell walls is greater than that of genes encoding enzymes that hydrolyze animal carbohydrates, probably reflecting the greater carbohydrate diversity elaborated by plants compared with animals. In almost every carbohydrate category, the gut microbiota has the highest degradative capacity. The distal gut appears to have all the necessary enzymes for plant polysaccharide digestion, with the puzzling exception of GH6 cellulases, suggesting that these enzymes have not been selected for cellulose breakdown in anaerobic animal gut bacteria, whereas they are common in soil bacteria and fungi that decay plant cell walls.
In conclusion, two major trends emerge. First, the functional profile of the collective microbial community is more similar within a body site than between sites, despite variation in taxonomic profiles. This means that the flora is specialized at each body site and that this specialization is clearly detectable by metagenomic sequencing, which is therefore able to trace functional adaptation to the carbohydrates that prevail at a given body site. Second, while broad predictions of the global number of CAZymes could possibly be made, given the overwhelming number of GHs and PLs encoded by Bacteroidetes compared with Firmicutes, the present results show that the actual CAZyme profile at each body site cannot yet be predicted because too few reference genomes are available.
However, without better knowledge of the precise substrate specificity of the enzyme families that show expansion or reduction, there seems to be little correlation between functional capability and taxonomic family. These results suggest that the exact functional profile of CAZymes at each body site cannot currently be predicted from genus abundances. Whilst the current efforts aimed at sequencing more reference genomes will sooner or later allow a finer prediction, precise functional CAZyme profiling will also require coupling metagenomic analyses to structural genomics initiatives and to high-throughput biochemical and other functional assays of CAZymes.
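The first trend, that functional profiles are more similar within a body site than between sites, can be checked with a simple pairwise comparison. The sketch below assumes per-sample CAZy family counts and uses Bray-Curtis dissimilarity as one possible measure; the choice of measure, the sample names, the counts and the functions `bray_curtis` and `mean_dissimilarity` are all illustrative assumptions rather than the method or data of this study.

```python
from itertools import combinations

# Hypothetical per-sample profiles: sample -> (body site, {family: gene count}).
# Illustrative values only, not data from the study.
profiles = {
    "oral_1": ("oral", {"GH13": 30, "GH6": 4, "GH28": 2}),
    "oral_2": ("oral", {"GH13": 26, "GH6": 2, "GH28": 3}),
    "gut_1": ("gut", {"GH13": 10, "GH48": 3, "GH28": 20}),
    "gut_2": ("gut", {"GH13": 12, "GH48": 1, "GH28": 24}),
}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count profiles (0 = identical)."""
    families = set(a) | set(b)
    num = sum(abs(a.get(f, 0) - b.get(f, 0)) for f in families)
    den = sum(a.get(f, 0) + b.get(f, 0) for f in families)
    return num / den if den else 0.0


def mean(values):
    """Arithmetic mean, or NaN when there are no values."""
    return sum(values) / len(values) if values else float("nan")


def mean_dissimilarity(samples):
    """Return (mean within-site, mean between-site) pairwise dissimilarity."""
    within, between = [], []
    for (site_a, prof_a), (site_b, prof_b) in combinations(samples.values(), 2):
        d = bray_curtis(prof_a, prof_b)
        (within if site_a == site_b else between).append(d)
    return mean(within), mean(between)


within, between = mean_dissimilarity(profiles)
print(f"within-site: {within:.3f}, between-site: {between:.3f}")
```

In practice the counts would first be normalized for sequencing depth, and the within-site versus between-site difference would be assessed with a permutation-based test rather than by comparing two means.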