Glycosylation is an essential form of post-translational modification that regulates intracellular and extracellular processes. Regrettably, conventional biochemical and genetic methods often fall short for the study of glycans, because their structures are often not precisely defined at the genetic level. To address this deficiency, chemists have developed technologies to perturb glycan biosynthesis, profile their presentation at the systems level, and perceive their spatial distribution. These tools have identified potential disease biomarkers and ways to monitor dynamic changes to the glycome in living organisms. Still, glycosylation remains the underexplored frontier of many biological systems. In this Account, we focus on research in our laboratory that seeks to transform the study of glycan function from a challenge to routine practice.
Functional studies of proteins and nucleic acids have often relied on genetic manipulations to perturb structure. Though glycans are not directly subject to mutation, we can determine glycan structure−function relationships by synthesizing defined glycoconjugates or by altering natural glycosylation pathways. Chemical syntheses of uniform glycoproteins and polymeric glycoprotein mimics have facilitated the study of individual glycoconjugates in the absence of glycan microheterogeneity. Alternatively, selective inhibition or activation of glycosyltransferases or glycosidases can define the biological roles of the corresponding glycans. Investigators have developed tools including small molecule inhibitors, decoy substrates, and engineered proteins to modify cellular glycans. Current approaches offer a precision approaching that of genetic control.
Genomic and proteomic profiling form a basis for biological discovery. Glycans also present a rich matrix of information that adapts rapidly to changing environs. Glycomic and glycoproteomic analyses via microarrays and mass spectrometry are beginning to characterize alterations in glycans that correlate with disease. These approaches have already identified several cancer biomarkers. Metabolic labeling can identify recently synthesized glycans and thus directly track glycan dynamics. This approach can highlight changes in physiology or environment and may be more informative than steady-state analyses. Together, glycomic and metabolic labeling techniques provide a comprehensive description of glycosylation as a foundation for hypothesis generation.
Direct visualization of proteins via the green fluorescent protein (GFP) and its congeners has revolutionized the field of protein dynamics. Similarly, the ability to perceive the spatial organization of glycans could transform our understanding of their role in development, infection, and disease progression. Fluorescent tagging in cultured cells and developing organisms has revealed important insights into the dynamics of these structures during growth and development. These results have highlighted the need for additional imaging probes.
Virtually every class of biomolecule can be found in a glycosylated form. This phenomenon extends from the glycoproteins, which we now know comprise ~50% of the total cellular proteome and >90% of the secreted proteome,(1,2) to lipids, tRNA,(3) and many secondary metabolites (Figure 1). But the question, “what do the glycans do?” remains unanswered in many cases. Decades of research in the rapidly expanding field of glycobiology have provided some insights. For example, glycans have been shown to govern biological homeostasis, playing central roles in protein folding, trafficking, and stability,(4) and in organ development.(5) Inside cells, protein glycosylation is thought to play a role in signaling, perhaps in concert with phosphorylation.(6) Cell-surface glycans are poised to mediate intercellular communication,(7) including pathogen recognition,(8,9) and to distinguish self from non-self immunologically.(10) In addition, the glycosylation state of both cell-surface proteins and lipids responds to external stimuli and internal cellular dysfunction. Thus, the dynamics of these molecules reflect the cell’s physiological state and can report on disease.(11)
Historically, approaches to studying glycans reflected the standard tactics of biological inquiry that were developed in the context of proteins and nucleic acids: (1) alter the structure or expression level and evaluate the biological consequence (i.e., perturb); (2) define the molecular inventory as a function of physiology (i.e., profile); (3) visualize the molecule in a living system to understand its distribution and dynamics (i.e., perceive). Based primarily in genetics and biochemistry, the experimental tools used to accomplish these goals for proteins and nucleic acids did not always translate to the study of glycans. For example, perturbation of glycan structures can be achieved by genetic mutation of glycosyltransferases, but the effects of such mutations are often masked by embryonic lethality or compensatory upregulation of redundant enzymes.(12,13) Lectins and antibodies with defined glycan specificities can be used to profile cell-surface glycans and to correlate global changes in their expression with developmental stages and disease.(14) Until recently, however, the available lectins and antibodies were limited in number. Finally, visualizing glycans in living systems is an unmet challenge for which no conventional experimental approach is suited. The ability to perceive these biopolymers as they undergo dynamic changes within organisms could transform our view of glycobiology.
New techniques derived from physical, analytical, and synthetic chemistry are starting to address many of the inadequacies of the conventional toolbox as applied to glycans. Several groups have contributed in important ways to the burgeoning field of chemical glycobiology. Their contributions include small molecules that interfere with glycan biosynthesis,(15−17) glycopolymers that modulate carbohydrate receptor activity,(18) and synthetic methods for assembling glycoconjugates.(19−22) Furthermore, analytical tools such as lectin microarrays and mass spectrometry are providing, for the first time, detailed pictures of the “glycome”.(23,24) Due to restrictions of space, this Account will focus primarily on our own efforts to develop small molecules to perturb, profile, and perceive glycans.
In principle, the synthesis of chemically defined glycoconjugates allows researchers to perturb biological systems in a tightly controlled fashion. Though syntheses of singly substituted glycopeptides are routinely achieved, the complexity of most glycoproteins and glycopeptides presents a formidable challenge. Full details of this approach have been recently reviewed.(25) A notable success comes from Danishefsky and colleagues who synthesized portions of the highly glycosylated protein erythropoietin A (EPO) by joining chemically synthesized glycopeptides via native chemical ligation.(26) These glycosylated peptides were more than 20 amino acids in length and contained elongated N- and O-linked glycans, thus providing a framework for the total synthesis of a complex glycoprotein. Semisynthesis of a full-length glycoprotein, GlyCAM-1, has been achieved via the combination of glycopeptide synthesis and expressed protein ligation technology. The protein included up to 13 sites of O-linked glycosylation (α-GalNAc-modified serines and threonines) located on two separate mucin domains of the protein.(27) Despite these remarkable achievements, the technical challenges associated with glycoprotein syntheses will likely prevent their routine application.
An alternative strategy to glycoprotein total synthesis is the development of synthetically tractable glycoconjugate analogs. Mucins are proteins containing repeating units of densely glycosylated domains whose biological functions often correspond to the degree of glycosylation. Recently we have reported the use of chemically defined mucin mimics to study glycans in synthetic lipid bilayers and cell surfaces.(28,29) The condensation of aminooxy-sugars with polymethylvinylketone gave highly modified glycopolymers (Figure 2). These glycoconjugates resembled native mucins in size, macromolecular structure, and biophysical properties. Incorporated into synthetic membranes or cell surfaces, these glycopolymers interact with carbohydrate binding proteins (lectins) in a manner specific to the appended glycan’s composition, arrangement, and stereochemistry. The polymers also demonstrate surface two-dimensional mobilities similar to cell-surface lipid-bound mucins. These polymers are a promising alternative strategy for studying the role of glycans in both intra- and intercellular receptor-mediated signaling.
Small molecule inhibitors of glycan biosynthesis can alter glycan structures.(30) Cell-surface glycans are constructed by the stochastic action of spatially constrained ER- and Golgi-associated glycosyltransferases as their target glycoconjugates traverse the secretory pathway. The roughly 250 glycosyltransferases predicted in mammalian genomes are attractive targets for small molecule inhibition, which could help elucidate their roles in dynamic cellular processes that are opaque to coarse-grained genetic manipulations. To serve as useful tools for biological inquiry, glycosyltransferase inhibitors must be selective for their target, active in cells, and potentially active in living organisms. Several natural products are known inhibitors of glycosylation, including tunicamycin, which blocks the biosynthesis of N-linked glycan precursors, and deoxynojirimycin, which broadly inhibits certain glycan processing enzymes. These molecules and their relatives have been widely employed to study the effects of glycosylation on protein function, but off-target effects limit their utility.
Chemical biologists have sought to expand upon this limited toolkit with the design of inhibitors that target specific enzymes. Pioneering work by Imperiali and co-workers targeted the enzymes responsible for the initiation of N-linked glycosylation. The researchers synthesized a potent inhibitor of oligosaccharyl transferase (OST), the enzyme that transfers N-linked glycans to proteins in the ER.(31,32) The compound was based on a constrained peptide that approximated the natural substrate’s Asx turn. Once optimized for cell-based environments, the compound promises to disrupt N-linked glycosylation in a more refined fashion than tunicamycin. Despite this and other notable successes in this area, the structural homology and functional overlap of the more than 200 glycosyltransferases make inhibition of individual glycosyltransferases a challenge.
An alternative strategy for modulating cellular glycan structures involves “primers of glycosylation”. In general, these compounds are simple naturally occurring sugars adorned with hydrophobic aglycones such as benzyl, phenyl, and naphthyl groups. Including β-xylosides,(33) α-benzyl GalNAc,(34) and various substituted disaccharides, the primers act as competitive substrates for glycosyltransferases inside cells.(35) At high concentrations, the primers outcompete endogenous glycans as acceptor substrates for glycosyltransferases. Thus, glycan chain elongation occurs on the soluble primers rather than on endogenous glycoproteins and glycolipids, and eventually, the modified primers are secreted from the cell (Figure 3A). In recent years, Esko and co-workers have employed a disaccharide primer, α-naphthyl GlcNAcβ1−3Gal, to inhibit the elaboration of cellular N-acetyllactosamine epitopes with sialic acid (Sia) and fucose (Fuc) residues to form the sialyl Lewis X (sLeX) motif.(35−38) Tumor-associated sLeX binds to P-selectin, a receptor on activated endothelial cells, and is thought to mediate the extravasation of tumor cells from the bloodstream into organs.(39) Treatment of LS180 cells with the disaccharide primer markedly decreased cell-surface expression of sLeX. The cells displayed diminished capacity to colonize the lungs of nude mice, suggesting inhibition of this pathway may be a viable approach for controlling metastasis.
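The competition between a primer and endogenous acceptors described above can be illustrated with a minimal numerical sketch. It assumes textbook competing-substrate kinetics, in which two substrates of the same enzyme are consumed in proportion to their specificity constants (kcat/Km) multiplied by their concentrations. The function name and every numerical value below are hypothetical and chosen only for illustration; none are taken from the studies cited in this Account.

```python
def primer_flux_fraction(primer_conc, primer_kcat_km, endo_conc, endo_kcat_km):
    """Fraction of glycosyltransferase turnover that lands on the primer.

    Assumes simple competing-substrate kinetics: each acceptor is used at a
    rate proportional to (kcat/Km) * concentration. Units must match between
    the two acceptors but are otherwise arbitrary.
    """
    primer_term = primer_kcat_km * primer_conc
    endo_term = endo_kcat_km * endo_conc
    total = primer_term + endo_term
    if total == 0:
        raise ValueError("at least one acceptor must be present")
    return primer_term / total


if __name__ == "__main__":
    # Hypothetical values: the primer is a 10-fold poorer substrate than the
    # endogenous acceptor, which is held at a fixed concentration of 1.
    for primer_conc in (0.1, 1.0, 10.0, 100.0, 1000.0):
        fraction = primer_flux_fraction(primer_conc, 0.1, 1.0, 1.0)
        print(f"primer = {primer_conc:7.1f}  fraction on primer = {fraction:.3f}")
```

Under these assumptions, even a relatively poor acceptor captures most of the enzyme's output once its concentration is high enough, which is consistent with the requirement for high primer concentrations noted above.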
Conservative modifications of monosaccharide structures are sometimes tolerated by biosynthetic enzymes, enabling the metabolic incorporation of the sugar analogs into cellular glycans (Figure 3B). The modification can disrupt recognition by downstream elaborating enzymes, or by receptors that eventually govern the glycan’s function. In a pioneering study, Reutter and co-workers demonstrated that unnatural variants of N-acetylmannosamine (ManNAc), in which the N-acyl substituent is elaborated, are recognized by the sialic acid biosynthetic machinery. Thus, treatment of cells with N-propanoylmannosamine (ManNProp) leads to replacement of a fraction of their cell surface sialic acid residues with the N-propanoyl derivative.(40) This structural perturbation interfered with sialic acid-dependent viral infection.(8)
A similar technique was employed to induce premature termination of glycan chains in neurons. Poly(sialic acid) (PSA), a homopolymer comprising 2,8-linked sialic acid residues, is a modification of the neural cell adhesion molecule (NCAM) that has been linked to neuronal plasticity.(41) Neurons grown in the presence of N-butanoyl mannosamine (ManNBut) were found to have shortened PSA chains, presumably due to inefficient extension of PSA polymers incorporating this substrate by polysialyltransferases.(42) In the future, biosynthetic modulation of PSA length using ManNBut might be applied to studies of PSA in neuronal development.
In addition to unnatural sialosides, derivatives of N-acetylglucosamine (GlcNAc) and galactose (Gal) have been incorporated into glycans using cellular machinery. Mice and human cell lines are able to incorporate 4-fluoro-GlcNAc into mucin-type O-linked glycans.(43,44) Incorporation of this modification terminates further elongation of glycan chains by removal of the acceptor nucleophile, thereby diminishing the production of epitopes that bind to E-selectin, a receptor on activated endothelial cells involved in inflammation. As a consequence, T-cells do not bind to the endothelial cells and homing to the inflamed tissue is blocked.
In an application to neurobiology, chain-terminating sugars were employed to study the processes of neurite outgrowth and synapse formation. Fucα1−2Gal has long been implicated in cognitive processes.(45) Hsieh-Wilson and co-workers studied the details of this process through the use of 2-deoxy-Gal (2-d-Gal), which is incorporated into cellular glycans thereby blocking formation of the Fucα1−2Gal epitope.(46) Synapsin, a protein involved in neurite outgrowth and neurotransmitter release, was rapidly degraded by calpain-mediated proteolysis in the absence of its normal Fucα1−2Gal modification. This observation suggests a critical role for Fucα1−2Gal in neuronal plasticity.
Inhibitors, primers, and unnatural substrates share the ability to downregulate glycosylation in a temporally controlled manner. However, the high degree of homology between related glycosyltransferases makes discrimination between these enzymes with specific small molecules a challenge. A complementary approach that we developed merges genetic and molecular approaches to allow temporal regulation of specific glycosyltransferases using a small molecule switch. The approach exploits the conserved modular architecture of most Golgi-resident glycosyltransferases, which comprise functionally distinct catalytic (CAT) and localization (LOC) domains that must be physically associated for activity on their cellular substrates. We fused separated CAT and LOC domains to the FK506 binding protein (FKBP) and the FKBP-rapamycin binding domain (FRB) of the mammalian target of rapamycin (mTOR), respectively.(47) In the absence of rapamycin, FKBP and FRB show negligible affinity for one another.(48) As a result, the LOC domain localizes normally to the Golgi membrane, while the catalytic domain is secreted from the cell, effectively limiting access to its substrates (Figure 4A). The addition of rapamycin initiates heterodimerization of FKBP and FRB, reconstituting the native localization of the CAT domain and restoring its enzymatic activity (Figure 4B). The approach was applied to glycosyltransferases(47,49) and to sulfotransferases(50) that act on glycans.
Decoding the information stored in the glycome is a major impetus of the field of glycobiology. Chemical approaches can vastly accelerate this process by enabling molecular-level analyses of glycans at the systems level. Furthermore, profiles of glycans associated with disease can provide new biomarkers for diagnosis and therapeutic monitoring.(51,52) Three complementary approaches for profiling glycans are now assuming prominent positions: lectin/antibody arrays, mass spectrometry, and metabolic labeling.
Lectins are nonantibody carbohydrate binding proteins that exhibit specificity for one or more glycan-associated epitopes. Historically, lectins have been used in low-throughput agglutination assays to categorize cell surfaces.(53) Advances in microarray technology coupled with the genome-driven discovery of new lectins have enabled more rapid and detailed analyses of glycosylation. Recently, Kuno and co-workers used evanescent fluorescence to quantify the binding of diverse Cy5-labeled glycoproteins to an array of 39 different lectins.(54) Mahal and co-workers used a similar approach to characterize bacterial cell surface glycomes.(23) In an extension of this theme, Haab and co-workers used microarrayed antibodies to capture specific glycoproteins from the serum of healthy donors and cancer patients to individual array features. Through use of a type of sandwich assay, lectin binding to the captured species identified specific glycan changes unique to the cancer patients.(55) While these techniques can detect both the identity and concentration of certain glycan epitopes, not every glycan structure has a corresponding lectin and many lectins show promiscuous binding. Nonetheless, lectin arrays can provide insight into glycosylation patterns associated with disease.
Comprehensive structural analysis of cell-derived glycans can be achieved using mass spectrometry. For decades, mass spectrometry has been the tool of choice for sequencing the oligosaccharide components of purified glycoproteins.(56,57) With the advent of ultrasensitive and ultrahigh-resolution mass spectrometry techniques, entire glycomes can now be analyzed from complex cell lysates or bodily fluids. Already, distinct glycan profiles have been identified in sera from patients with ovarian,(51) breast,(52) and prostate(58) cancers using mass spectrometry profiling techniques. These discoveries represent the first step in identifying new clinical biomarkers.
Both lectin arrays and mass spectrometric techniques provide a snapshot of glycosylation at steady state. The dynamics of glycosylation, still difficult to analyze using these techniques, may provide additional information relevant to disease monitoring or to fundamental studies of glycobiology. For instance, cancer cells are known to metabolize simple sugars more rapidly than healthy cells, a trait that has been exploited in diagnostic imaging.(59) Techniques that assess glycosylation dynamics, including monosaccharide uptake and processing, glycan biosynthesis, and membrane turnover, could provide a new dimension to glycome profiling.
We have made use of the bioorthogonal chemical reporter strategy to specifically tag newly synthesized glycoconjugates, which can then be purified and inventoried. In this procedure, an unnatural monosaccharide containing a small, bioorthogonal functional group is incorporated into glycans using the cell’s metabolic machinery.(60) Much like a pulse−chase experiment, the incubation time and concentration of the synthetic sugar can be varied so that it labels only the most recently synthesized glycans and not the entire glycome. Subsequently, a reagent specific for the bioorthogonal functionality is used to modify the newly synthesized glycans with a probe for capture and enrichment (Figure 5A).
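The dependence of labeling on incubation time can be made concrete with a toy calculation. The sketch below is not a model taken from the work described here: it assumes a glycan pool at steady state that turns over with first-order kinetics, and a constant probability that each newly made glycan carries the unnatural sugar. The function name, rate constants, and efficiency value are all hypothetical.

```python
import math


def labeled_fraction(pulse_hours, turnover_per_hour, incorporation_efficiency=1.0):
    """Fraction of a glycan pool carrying the reporter after a labeling pulse.

    Assumes a steady-state pool replaced with first-order rate constant
    `turnover_per_hour`, and that a fixed fraction
    (`incorporation_efficiency`) of newly synthesized glycans incorporates
    the unnatural sugar.
    """
    newly_made = 1.0 - math.exp(-turnover_per_hour * pulse_hours)
    return incorporation_efficiency * newly_made


if __name__ == "__main__":
    # Hypothetical pools: one that turns over quickly and one slowly.
    for hours in (1, 4, 24):
        fast = labeled_fraction(hours, turnover_per_hour=0.5, incorporation_efficiency=0.3)
        slow = labeled_fraction(hours, turnover_per_hour=0.02, incorporation_efficiency=0.3)
        print(f"{hours:2d} h pulse: fast pool = {fast:.3f}, slow pool = {slow:.3f}")
```

In this simplified picture, a short pulse preferentially marks rapidly renewed glycans, whereas a long pulse approaches the ceiling set by incorporation efficiency for all pools, which is why pulse length determines how selectively recent synthesis is reported.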
The azide has proven to be the most versatile bioorthogonal functional group for metabolic labeling of glycans, because it is small, abiotic, and stable in cellular systems. Sia,(61) GalNAc,(62) GlcNAc,(63) and Fuc(64,65) can all be substituted with azido analogs in cultured cells, and Sia and GalNAc can be labeled with azides in mice (Figure 5B). In contrast to chain terminators, the azide-bearing sugars are not intended to disrupt further glycan elaboration and are generally well-tolerated by cultured cells and mice.(62,66,77) Once installed in a glycan, the azide can be selectively reacted with phosphines via the Staudinger ligation, with terminal alkynes via Cu-mediated “click chemistry”, or with cyclooctynes by a strain-promoted [3 + 2] cycloaddition(65,67−70) (Figure 5C). These reactions allow chemical labeling of the azido glycan with any probe of choice (e.g., biotin, FLAG peptide, or fluorescent dyes). Fucose and sialic acids modified by terminal alkynes can also be incorporated onto cell surfaces and imaged via click chemistry (Figure 5B);(65,71) however, the inherent toxicity of the copper reagents precludes imaging of live organisms.
The bioorthogonal chemical reporter strategy has been artfully employed in the proteomic analysis of protein O-GlcNAcylation, a form of glycosylation that is unique to cytosolic and nuclear proteins and thought to be a key modulator of protein function.(6) Despite its potential importance, the complete repertoire of O-GlcNAcylated proteins is not known. We demonstrated that GlcNAcylated proteins can be metabolically labeled with the azido analog N-azidoacetylglucosamine (GlcNAz), and then biotinylated by Staudinger ligation with a phosphine probe.(63) Zhao and co-workers exploited this discovery to capture O-GlcNAcylated proteins for a comprehensive proteomic analysis of cell lysates.(72)
In another application, the bioorthogonal chemical reporter method was applied to analyze mucin-type O-linked glycoproteins from cell lysates(62) and murine tissues.(73) N-Azidoacetylgalactosamine (GalNAz) was found to label the O-linked glycans in both contexts, substituting for the conserved glycan core residue GalNAc. Interestingly, the GalNAz labeling efficiency of B cells from treated mice was considerably higher than that of T cells, a result that was not predicted based on their steady-state glycomes. This observation underscores the differences in information provided by metabolic labeling versus lectin or mass spectrometry methods of glycan profiling.
Understanding how changes in glycosylation relate to physiological changes requires a means to visualize these biopolymers in a physiologically relevant context. Tagging proteins with fluorescent-protein fusions has proven to be an invaluable technique for tracking their localization and function in living cells. Analogous tools should be sought for probing the abundance, distribution, and dynamics of glycans in vivo. The bioorthogonal chemical reporter strategy holds much promise in this regard. Glycans can be labeled with SiaNAz and GalNAz in living animals.(66,73) In addition, phosphine probes injected into these animals undergo the Staudinger ligation and accumulate on cell surfaces in an azide-dependent manner. The stage has thus been set for the delivery of imaging probes that can report on the presence of specific glycan subtypes.
The development of imaging reagents that target azides is an important next step. Two recent examples include the development of phosphine- or alkyne-functionalized dyes that fluoresce only upon reaction with azides and thereby label cell surface glycans with low background staining.(65,71,74) The application of these reagents in living animals, however, may be limited. The Staudinger ligation is plagued by slow kinetics and the Cu-mediated “click chemistry” requires a toxic heavy metal catalyst. To overcome these problems, we recently developed a difluorinated cyclooctyne reagent, termed DIFO (Figure 5C(iii)), that rapidly reacts with azides and is nontoxic in mice.(75,76) The rapid kinetics of DIFO−azide reactions were exploited to monitor the dynamics of cellular glycosylation during zebrafish embryo development. Zebrafish embryos labeled with GalNAz were tagged with fluorescently labeled DIFO reagents to image the fish’s total O-linked glycosylation.(77) We extended this technique to monitor dynamic glycosylation by quenching unreacted azide groups on labeled cells with TCEP, pulsing the embryos with additional GalNAz, and probing the embryos with a blue-shifted DIFO conjugate (Figure 6). Areas of rapid O-linked glycan biosynthesis including the fins, jaw, and olfactory organs showed increased labeling with the blue-shifted conjugate. This quench, label, tag sequence exploits the high reactivity of DIFO to visualize glycan concentrations below the detection limit for comparable phosphine probes. In addition, these small molecule-based reagents are more accessible to the tissues than lectin or antibody-based imaging agents. Post-labeling, the zebrafish appeared to continue normal development, confirming the low toxicity of these reagents. The extension of this work to mammalian systems should enable imaging of tumors and sites of microbial infection.
Chemical tools for perturbing, profiling, and perceiving glycans should reduce the barriers that have previously impeded progress in this field. Their implementation has already produced a body of knowledge that was formerly limited to the domain of proteins and nucleic acids. For example, a developing glycomic database has begun to facilitate bioinformatics studies on glycans.(78) As mentioned above, profiles of glycans from tissues and bodily fluids have produced new biomarker candidates. Many reagents for perturbing glycans, including inhibitors and primers, are now commercially available. Similarly, azidosugars and both phosphine- and alkyne-based labeling reagents can now be obtained from commercial suppliers. The future is bright for use of these tools in analyzing the dynamic glycomes of normal and diseased tissues.
The authors’ work that was mentioned in this review was primarily supported by grants to C.R.B. from the National Institutes of Health (GM66047, GM58867, and GM59907).
Nicholas J. Agard is an NIH postdoctoral fellow at the University of California, San Francisco. He completed his undergraduate degree in Chemistry at Brown University in 2002 and his Ph.D. in chemical biology at the University of California, Berkeley, in 2007 with a thesis focused on glycan perturbation and detection. Currently he is studying proteolysis in innate immunity.
Carolyn Bertozzi is a Howard Hughes Medical Investigator, Director of the Molecular Foundry, and the T. Z. and Ingrid Chu Distinguished Professor of Chemistry and Molecular and Cellular Biology at the University of California, Berkeley. She received her undergraduate degree from Harvard University in 1988 and her Ph.D. from the University of California, Berkeley, in 1993. She has published more than 150 scientific articles and received numerous honors including memberships in the National Academy of Sciences and the American Academy of Arts and Sciences. Professor Bertozzi’s current research investigates a number of fields including glycobiology, nanomaterials, and molecular imaging.