All mucins are highly O-glycosylated with glycans that vary depending on species, histo-blood group and organ. This makes the main intestinal mucin, MUC2, non-degradable by the host digestive system but degradable by both commensal and pathogenic bacteria. The MUC2 glycans are important for the selection of commensal bacteria and act as a nutritional source for them; this also helps the host recover some of the energy spent on constantly renewing the protective mucus layer.
Glycosylation is the most diverse and common post-translational modification of cell surface and secreted proteins. N-glycosylation is the best studied and most predictable form, whereas O-glycosylation is more diverse and less well understood. O-glycosylation is also often called mucin-type glycosylation, as it is typical of mucins, in which O-glycans often account for more than 80% of the mass. This review discusses mucin-type O-glycosylation, especially the O-glycosylation of the human and mouse intestinal mucin MUC2, in relation to bacteria and disease.
The two hydroxyl amino acids, serine and threonine, are the attachment sites for the mucin O-glycans. O-glycosylation is initiated by one of the twenty peptidyl-GalNAc transferases encoded in the human genome, which use UDP-GalNAc to add GalNAc in an alpha-configuration. Each of these 20 transferases has its own specificity with respect to peptide or glycopeptide sequence. Some of these enzymes are more general in their preference of glycosylation site, whereas others are more specific. O-glycosylation is normally initiated in the Golgi apparatus and requires that the Ser/Thr residues are exposed on the folded protein. The amino acid sequence to be glycosylated cannot be well folded, and typically the amino acid Pro is found close to the attachment site. The 20 different transferase specificities and the restriction to exposed and less folded regions make prediction of O-glycosylation from the amino acid sequence alone unreliable, even though such tools have been developed. The initiating GalNAc is extended by the addition of Gal, GlcNAc or GalNAc on the 3- and/or 6-hydroxyl groups, giving primarily the Core1, 2, 3, and 4 structures [3–6]. These are further extended by Gal and GlcNAc and often terminated by sialic acid, a sulfate group, GalNAc and/or Fuc. The glycosyltransferases expressed in the specific cell determine the O-glycan outcome, and thus the same protein sequence can carry highly different glycans. In addition to the transferases present, the localization of the transferases within the Golgi stacks affects the resulting glycosylation. In the secretory pathway of the cell there is a declining pH gradient from the endoplasmic reticulum to the trans-Golgi network that is important for glycosyltransferase localization. Neutralization of this gradient moves the transferases forward and results in less glycosylated proteins.
Mucins are characterized by long, extended, rod-like glycopeptide domains that are called mucin domains [6,9,10]. These are densely O-glycosylated, making them look like a bottle brush with a central protein core and the glycans pointing out in all directions as bristles. The protein cores of the mucin domains are characterized by abundant Ser, Thr and Pro and are called PTS domains after the single-letter codes of these amino acids. Originally these sequences were called VNTR, for variable number of tandem repeats. However, quite a few mucins do not show any tandem repeats and do not vary in length. Instead, all these sequences are characterized by long stretches rich in Thr and/or Ser (typically >40% of all amino acids) and Pro (typically >5%). In fact, one can use these criteria in search algorithms for mucin genes, as there is no sequence conservation even between closely related species [11,13]. It seems as if only the number of glycan attachment sites and the total length of these domains have been conserved during evolution. These PTS sequences are often long (>2,000 amino acids) and limited to a single exon. However, there are a few exceptions in which each PTS repeat is encoded by its own exon, thus allowing the mucin domain length to vary by alternative splicing [11,14].
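The compositional criteria above (Ser+Thr typically >40% and Pro typically >5%) lend themselves to a simple sliding-window scan. The following is a minimal sketch of such a search, not the algorithm used in the cited studies; the function name `find_pts_regions`, the window length of 100 residues, and the merging of overlapping windows are illustrative assumptions.

```python
def find_pts_regions(seq, window=100, min_st=0.40, min_pro=0.05):
    """Return merged half-open (start, end) ranges of PTS-like sequence.

    A window qualifies when its Ser+Thr fraction exceeds min_st and its
    Pro fraction exceeds min_pro. The thresholds follow the typical values
    quoted in the text; the window length is an arbitrary assumption.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        win = seq[start:start + window]
        st_fraction = (win.count("S") + win.count("T")) / window
        pro_fraction = win.count("P") / window
        if st_fraction > min_st and pro_fraction > min_pro:
            hits.append((start, start + window))

    # Merge overlapping or adjacent qualifying windows into regions.
    regions = []
    for start, end in hits:
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Toy sequence: a synthetic PTS-rich stretch flanked by unrelated residues.
    toy = "A" * 50 + "TTSP" * 30 + "G" * 50
    print(find_pts_regions(toy, window=40))
```

Because windows that only partly overlap a PTS stretch can still pass the thresholds, the reported boundaries are approximate; a real search would also need a minimum region length and a check against low-complexity sequence that is not mucin-like.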
All mucins have one or several PTS domains that are localized either N-terminally, as in the type 1 membrane-bound mucins, or centrally, as in the gel-forming mucins [6,10,12,15]. The MUC7 mucin is more or less only a PTS domain. There are two types of transmembrane mucins, the SEA and the NIDO-AMOP-vWD types. The SEA mucins all have a SEA domain just outside the cell membrane. This domain is autocatalytically cleaved during biosynthesis by the folding energy, but the two parts remain held together as one molecule at the cell surface. This interaction is strong enough to hold the highly O-glycosylated extracellular mucin domain, with a mass >2 MDa, anchored to the cell membrane, but at the same time it acts as a breaking point when the physical stress on the surface becomes so strong that it threatens to rupture the cell membrane. The second group, the NIDO-AMOP-vWD mucins, has only one member among mammals, the MUC4 mucin. This mucin has its three typical domains next to the large PTS domain at the extracellular cell surface. Like the SEA mucins, MUC4 is cleaved during biosynthesis. Most transmembrane mucins are found in the gastrointestinal tract: MUC1, MUC3, MUC4, MUC12, MUC13, MUC16, and MUC17. Except for their role in cancer, the function of the transmembrane mucins is less well explored. These mucins are typically found on the apical side of polarized epithelial cells and extend far out from the cell membrane [15,19]. They all appeared during vertebrate evolution, in contrast to the gel-forming mucins, which appeared early during metazoan evolution. In addition to the mucins, there are other important glycoproteins in the gastrointestinal tract, for example DMBT1 with its scavenger receptor repeats.
The gel-forming mucins are characterized by a long N-terminus built of three vWD assemblies, one or several PTS domains interrupted by CysD domains, and a C-terminus ending in a cysteine knot (Fig. 1) [21–23]. Humans have four expressed gel-forming mucins, MUC2, MUC5AC, MUC5B and MUC6, whereas frogs and fish have many more. All these mucins form large polymers held together two-and-two by the cysteine knot in the C-terminus, and in the N-terminus two-and-two in MUC5B or three-and-three in MUC2 [23–26]. In the gastrointestinal tract, the MUC5AC and MUC6 mucins are found in the stomach surface cells and glands, respectively, whereas the rest of the intestine has MUC2 as its typical mucin [6,23]. Because of its type of oligomerization, MUC2 generates large net-like sheets that, when layered on top of each other, form efficient size-exclusion filters. The MUC2 mucin is produced by goblet cells, where it is densely packed owing to the calcium and low pH in the secretory vesicles. This packing is required for a well-organized release that allows the >1,000-fold expansion of MUC2 upon secretion. To allow this expansion, sufficient levels of bicarbonate are required to remove calcium and increase the pH. Most important for this expansion are the highly O-glycosylated mucin domains, which attract water and thereby generate physical forces that pull out and expand the folded MUC2 into flat sheets [26,28].
The MUC2 mucin builds the skeleton of the intestinal mucus. In the small intestine the mucus is less dense and, after bacterial colonization, not attached to the epithelium. In the large intestine, the mucus is organized into two layers: an inner, attached, dense mucus layer that is essentially free of bacteria and an outer, non-attached, less dense mucus layer that allows bacteria to enter and thrive. The inner mucus layer of the distal colon is renewed every hour, and its conversion to the outer layer depends on host-controlled proteolysis that allows expansion of MUC2 without directly dissolving the polymeric nature of the mucin [31,32]. The properties of the inner colon mucus layer depend on the commensal bacteria, and when normally developed it acts as a physical filter that excludes bacteria as well as bacteria-sized beads [33,34]. The function of this inner mucus layer relies fully on the integrity of the MUC2 polymeric network.
The O-glycan repertoire on the MUC2 mucin differs between species and also shows individual variation, as in the ABH(O), Lewis, and Secretor histo-blood group systems. The human small intestinal MUC2 is fucosylated in Secretor-positive individuals by the FUT2 enzyme, forming an epitope that is further extended depending on the individual's AB blood group status. The human colon shows less variability between individuals, and the distal colon MUC2 glycans are relatively homogeneous among humans and rich in the Sda/Cad epitope [36–39]. The human colon glycans are characterized by NeuAc attached to C-6 of the peptide-bound GalNAc. In addition, sulfation of Gal is prominent, as is acetylation of the sialic acids.
The species differences between mice and humans are reflected at the Core structure level; for example, human glycans are dominated by Core3, whereas mice have mainly Core1 and Core2 structures [39,41]. As only specific inbred strains of mice have been studied, we do not really know the extent of blood group variability in mice living free in nature. C57BL/6 mice have been studied the most, and these have little fucosylation in the normal small intestine, but more fucosylated compounds are found in the colon.
Although there are specific features depending on species, all studied mucins have a very diverse glycan repertoire, with up to a hundred different structures on one mucin from one source. In fact, it has been calculated that all epitopes cannot be present together on one and the same Muc2 mucin domain. The high number of glycan structures is partly explained by the presence of biosynthetic glycan intermediates, and each organ typically has a limited number of fully ‘completed’ products. Comparing O-glycans with glycosphingolipids from the same tissue shows that in the latter case only certain specific intermediates are present. This argues for differences in the glycosylation processes for these two types of glycoconjugates. The high variability and the presence of all intermediate glycans on mucins make functional sense if a major function of the mucin is to bind many different bacteria that carry surface adhesins. In this way the mucins can select certain bacteria, or trap bacteria and move them away from the host.
The mucin O-glycans are closely linked to the intestinal commensal bacteria. Our recent studies show that the Muc2 glycosylation of mice raised under germ-free (GF) conditions differs from that of mice colonized at birth, i.e. conventionally raised (Conv-R) mice (Fig. 2). The levels of most glycosyltransferases are increased by the presence of microbiota, whereas a few are decreased (Fig. 2). The glycan structures produced are the result of the combined actions of all the transferases; for example, even though the B4galnt2 enzyme responsible for the formation of the Sda/Cad epitope is decreased by the microbiota, the levels of glycans with this structure are increased, likely because more precursors are available. Overall, the Conv-R mice have more extended structures and higher fucosylation. Glycans similar to those in the Conv-R mice are obtained if GF mice are colonized by normal mouse microbiota, but this takes up to three weeks. Nothing is known about how the bacteria can signal this at a distance, or about the host control circuits responsible for these glycosylation alterations.
The mucin O-glycans, and especially those of MUC2, are probably very important for the selection of the typical bacterial composition found in each species. This has been elegantly illustrated by John Rawls, who observed that zebrafish and mice have different normal microbiota. After germ-free zebrafish and mice were cross-colonized with the other species' microbiota, the microbiota characteristic of the recipient zebrafish or mouse was selected out from the donor bacteria. This illustrates that the host can select its microbiota. Comparing the commensal bacteria of mice and humans shows differences that further support the idea that the host can influence the selection of its bacteria [47,48]. There are probably several mechanisms by which the host can influence or select its microbiota, such as the small intestinal repertoire of antimicrobial peptides/proteins [49,50]. Of course, other factors such as food or environment will also influence the bacterial composition [33,51]. However, the mucin O-glycan repertoire is probably influential, as the glycans can act as attachment sites for the numerous bacterial adhesins. This is probably especially important for bacteria residing at the interface between the inner and outer mucus layers, as the mucin glycans are essentially intact at this site [31,52].
Of further importance is the capacity of bacteria to degrade the mucin glycans provided by the host [53,54]. As the glycans vary between species, the bacteria need to carry a repertoire of enzymes adapted to their specific host. Typical of the commensal bacteria is their rich repertoire of glycan-degrading enzymes. It has been estimated that up to 40% of the bacterial genomes encode these types of enzymes. The released monosaccharides are used by the bacteria providing the enzymes or by other members of the microbiota. The relative importance of the host mucin glycans, polysaccharides and other saccharides degraded by the bacteria is difficult to quantify. In any case, the host mucin glycans are important for the host microbiota, as illustrated by mice that partially lack Core1 glycans in the intestine. These mice have higher levels of Bacteroidetes and lower levels of Firmicutes than wild-type animals. There is great interest in the human intestinal microbiota in relation to human health and disease, and it has become common to colonize GF mice with human microbiota to study their effects on the host [56,57]. These mice have been called ‘humanized mice’, but these studies have not yet considered the influence of mouse mucin glycans on the bacteria typical of humans.
The commensal colonic bacteria can efficiently utilize the monosaccharides released by the bacterial hydrolases and transported into the bacteria by specific transporters. Many bacteria carry large operons that include sensory molecules, hydrolases and transporters. Not all bacteria have all the enzymes necessary for cleaving specific sugar linkages, and it is common that several bacteria collaborate in a community. The bacteria metabolize the monosaccharides to smaller molecules, most commonly the short-chain fatty acids acetate, propionate and butyrate. These small molecules can easily diffuse through the inner mucus layer and provide the host and its epithelium with energy. In this way, the host recovers some of the energy spent on building and renewing the inner mucus layer. That this is a true symbiotic relation is illustrated by the bacterial capacity to stimulate the formation of a denser and less penetrable mucus, which provides the host with better protection and the bacteria with more food. The details of how the bacteria signal this to the host are not understood.
The intestinal bacteria are very efficient at degrading the mucus oligosaccharides, and eventually the protein core is also degraded, as very little mucus is found in normal feces. At the same time, it is important that the degradation of the glycans is not so fast that the mucus is broken down faster than it is renewed. The importance of sufficiently long O-glycans is illustrated by the Core3−/− mice, which are more susceptible to dextran sulfate sodium (DSS)-induced colitis, and the Core1−/− mice, which can spontaneously develop colitis [59,60]. This is in concordance with the knowledge that mice have largely Core1 glycans, but it is also highly dependent on the bacterial composition, as the inflammation varies between animal facilities.
In order to maintain intact intestinal mucus, it is very important that the digestive enzymes are not able to degrade the MUC2 mucin polymer in either the small or the large intestine. In fact, the MUC2 mucin is not affected by the digestive proteases from the pancreas or the epithelium. This depends largely on its glycosylation, as the mucin domains of MUC2 and other mucins are resistant to proteases: their dense glycosylation does not allow proteases to reach the protein core. The less glycosylated N- and C-termini of MUC2 are still not degraded, as these parts are highly stabilized by numerous disulfide bonds that hide the specific cleavage sites for the pancreatic endoproteases (trypsin, chymotrypsin and elastase). It is also important to realize that most of the three N-terminal vWD assemblies, up to the site of trimerization, are not required for maintaining the MUC2 polymeric network (Figs. 1 and 3). If the MUC2 mucin is exposed to trypsin, the polymeric network remains intact, but once the disulfide bonds are reduced the whole molecule falls apart into small peptides and two large glycopeptides of about 350 and 700 kDa, representing the two mucin PTS domains. Hosts do not secrete any enzymes in the gastrointestinal tract that can degrade the type of O-glycans that cover mucins. Commensal bacteria, on the other hand, carry numerous such glycan-degrading enzymes [53,57].
Pathogenic organisms, on the other hand, have developed mechanisms to circumvent the normal intestinal protective systems. One example is the large parasite Entamoeba histolytica, which needs to penetrate the inner colon mucus layer to reach the epithelium in order to invade the host. Interestingly, the human MUC2 mucin, in contrast to mouse Muc2, has a more susceptible site for specific proteases located between the trimeric and dimeric positions (Fig. 3). Cleavage anywhere between these two points will dissolve the MUC2 polymeric network. Studies of the cysteine proteases of E. histolytica have shown that the EhCP enzyme is capable of cleaving the MUC2 protein core at the SIIRT↓TGLR sequence. However, this is not possible if the second Thr is O-glycosylated (Fig. 3). Interestingly, this site is glycosylated only by the peptidyl-GalNAc transferase 3 (GALNT3) (Fig. 3). This suggests that this specific transferase could be of special importance for humans, where it is the most abundant GALNT. It is known that only about 20% of E. histolytica-infected individuals get invasive disease, something that could be related to the level of glycosylation by GALNT3. Individuals who totally lack the GALNT3 enzyme have Familial Tumoral Calcinosis, a rare disease with hyperphosphatemia and ectopic calcifications in which the absence of specific O-glycosylation causes abnormal protease processing. The lack of GALNT3 in these individuals should probably make them more susceptible to E. histolytica infection.
The oral pathogenic bacterium Porphyromonas gingivalis produces a cysteine protease (RgpB) capable of cleaving MUC2 one amino acid away from the E. histolytica site, at SIIR↓TTGLR (Fig. 3). This cleavage is also blocked by glycosylation by GALNT3. This is not a typical colonic bacterium, although it is found there as well as in the mouth, but there are likely other bacterial proteases capable of cleaving MUC2. Interestingly, neither of the two enzymes cleaving MUC2 at this site is able to cleave MUC2 at any other position.
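The relation between the two cleavage positions and the protective glycosylation can be written out as a toy model. The sketch below only encodes what the text states: the E. histolytica enzyme cuts at SIIRT↓TGLR, RgpB cuts at SIIR↓TTGLR, and GALNT3 glycosylation at the site blocks both. The function name `cleave_muc2_site`, the dictionary keys, and the use of a single boolean for the glycosylation state are illustrative assumptions, not a description of enzyme kinetics.

```python
SITE = "SIIRTTGLR"

# Number of residues of SITE that lie before the scissile bond,
# taken from the two cleavage positions given in the text.
CUT_OFFSET = {
    "EhCP": 5,  # SIIRT|TGLR
    "RgpB": 4,  # SIIR|TTGLR
}


def cleave_muc2_site(sequence, protease, galnt3_glycosylated):
    """Return the fragments produced by one protease acting on one sequence.

    Assumption: glycosylation by GALNT3 is modeled as an all-or-nothing
    flag that blocks cleavage by either protease, as described in the text.
    """
    position = sequence.find(SITE)
    if position == -1 or galnt3_glycosylated:
        return [sequence]  # no site present, or site protected
    cut = position + CUT_OFFSET[protease]
    return [sequence[:cut], sequence[cut:]]


if __name__ == "__main__":
    # Hypothetical flanking residues around the site, for illustration only.
    peptide = "AAA" + SITE + "GGG"
    for protease in CUT_OFFSET:
        print(protease, cleave_muc2_site(peptide, protease, False))
        print(protease, cleave_muc2_site(peptide, protease, True))
```

In this model the two enzymes yield fragments that differ by a single threonine, and the glycosylated sequence is returned intact for both.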
Genetic differences in glycosyltransferases cause altered glycans; some of these give severe disease, whereas others do not. There are also numerous publications suggesting that glycosylation alterations are linked to disease [68,69]. This has been the case for Cystic Fibrosis, where the absence of the CFTR channel causes severe problems in the respiratory tract. Altered glycosylation of the mucins MUC5B and MUC5AC was earlier suggested to be caused by the absent CFTR channel, but later studies have shown that the glycosylation alterations are secondary effects of infection and inflammation [70–72].
Chronic infection by Helicobacter pylori, and the gastric tissue inflammation it induces, causes an altered glycan repertoire with increased levels of sialyl-Lewis a/x antigens. The glycan changes are caused by transcriptional up-regulation of the B3GNT5, B3GALT5, and FUT3 genes.
Ulcerative colitis (UC) is also a disease in which altered glycosylation has been implicated. A thorough study of the MUC2 mucin O-glycosylation of more than 50 individuals showed that control individuals and most of the UC patients with no or little inflammation had similar glycans [39,74]. The human MUC2 glycans are characterized by NeuAc2-6GalNAc and Core3 glycans terminated by Sda/Cad epitopes, Fuc and sulfate (Fig. 4). In a separate study of colon epithelium, the glycosyltransferases responsible for these normal structures were observed (Fig. 4). Only the UC patients with more severe inflammation and active disease showed altered glycans, as indicated by the arrows in Fig. 4. These patients had increased levels of NeuAc2-6GalNAc (STn) and lower levels of more complex glycans, for example those carrying the Sda/Cad epitope (Fig. 4). Interestingly, when one patient with active disease and altered MUC2 O-glycans was analyzed a second time, in remission and with little inflammation, the patient had returned to a normal glycan pattern. This clearly shows that the altered glycosylation in this disease, too, was due to the inflammation and not caused by the disease itself.
Cancer and cancer development are well known for showing altered glycans, where the most characteristic and common features include shortened glycans and increased sialylation. This topic was recently reviewed in depth.
The variable O-glycosylation on mucins generated by the high number of glycosyltransferases is important for the protective and bacterial binding capacity of mucus. A major reason for the high number of peptidyl-GalNAc and other transferases involved in O-glycan biosynthesis should be sought in the interplay of mucins with microorganisms. Inflammation and infection trigger alterations in glycosylation and may hint at general mechanisms and roles of epithelial glycans in responding to outside threats.
Studies of the interaction of the commensal microbiota with the host, and of the bacterial effects on host metabolism and other functions, will be of the highest priority in the coming years. From a glycobiology point of view, we need to develop better experimental animal models that are truly humanized. This means that they carry human mucins with human glycosylation patterns, something that requires a number of gene modifications in mice. The mechanisms behind alterations of glycosyltransferase levels and activities, and thus of the glycan repertoire on mucosal surfaces, upon inflammation and infection are poorly understood. Too little attention has been paid to transcriptional and other control systems of glycosyltransferase expression.
This work was supported by the Swedish Research Council, The Swedish Cancer Foundation, The Knut and Alice Wallenberg Foundation, IngaBritt and Arne Lundberg Foundation, Sahlgren's University Hospital (ALF), Wilhelm and Martina Lundgren’s Foundation, Torsten Söderberg Foundation, The Sahlgrenska Academy, National Institute of Allergy and Infectious Diseases (U01AI095473, the content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH), The Swedish Foundation for Strategic Research - The Mucus-Bacteria-Colitis Center (MBC) of the Innate Immunity Program, Cystic Fibrosis Foundation (CFF), and Lederhausen’s Center for CF Research at Univ. Gothenburg.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.