Carbohydrates encode biological information necessary for cellular function. The structural diversity and complexity of these sugar residues have necessitated the creation of novel methodologies for their study. This review highlights recent technological advancements that are starting to unravel the intricate web of carbohydrate biology. New methods for the analysis of both glycoconjugates and glycan structures are discussed. With the use of these innovative tools, the field of glycobiology is poised to take center-stage in the post-genomic era of modern biology and medicine.
Glycomics or the systems-level study of carbohydrates (glycans) has emerged as an important area of research in the post-genomic era. Although proteins are tagged with a variety of post-translational modifications including phosphorylation, acetylation, and methylation, glycosylation is the most prevalent modification, occurring on at least 50% of all proteins (1). The vast majority of glycoproteins and glycolipids are found at the cell membrane creating a carbohydrate-coated surface which communicates with the extracellular world. Interactions with cell surface oligosaccharides play a critical role in numerous biological events including differentiation, cellular adhesion, immune responses, and host-pathogen interactions (2–7). For example, the expression of an α-2,6-sialyltransferase (ST6GALNAC5) has been shown to mediate metastasis of breast cancer cells to the brain by enhancing adhesion of tumor cells to brain tissue (8). Although there is increasing evidence for critical biological roles of carbohydrates, the challenges posed by the diversity of glycans have slowed progress in understanding this class of biopolymers. This review delineates the challenge of glycomics and how it is being met by the advent of new technologies to analyze both structural and functional aspects of carbohydrates.
Glycoconjugates, defined as both glycolipids and glycoproteins, are known to occur in organisms ranging from bacteria to mammals. Most work in glycomics has focused on mammalian glycans which consist of 10 monosaccharide units that form the building blocks for complex oligosaccharide polymers (Figure 1, panel a) (9). In comparison to nucleic acids and proteins, carbohydrates possess certain unique properties that render their analysis difficult. First, monosaccharide monomers can be connected via one of many hydroxyl groups, resulting in a diverse set of linkages that form either linear or branched structures. In addition, stereochemistry at the linkage site can be either α or β, resulting in a high number of possible glycan epitopes (10, 11). For example, two mannose residues can theoretically be connected by α-1,2, α-1,3, α-1,4, α-1,6, β-1,2, β-1,3, β-1,4, or β-1,6 linkages, although only some of these connections have been observed (Figure 1, panel b). Carbohydrate residues can be further modified by sulfation, phosphorylation and acetylation (2, 12–14). This stereochemical and structural diversity provides a major analytical challenge to glycomics.
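The mannose example above can be reproduced with a short enumeration. This is a minimal sketch, assuming the donor mannose links through its anomeric carbon (C1) and the acceptor offers hydroxyl groups at C2, C3, C4 and C6; the function and constant names are illustrative only.

```python
from itertools import product

# Assumption for illustration: the donor residue links through C1, and the
# acceptor mannose offers free hydroxyls at C2, C3, C4 and C6.
ANOMERS = ("α", "β")
ACCEPTOR_POSITIONS = (2, 3, 4, 6)

def possible_linkages(anomers=ANOMERS, positions=ACCEPTOR_POSITIONS):
    """Return every anomer/position combination as a linkage label."""
    return [f"{a}-1,{p}" for a, p in product(anomers, positions)]

linkages = possible_linkages()
print(len(linkages), linkages)
```

The count is theoretical: as noted above, only some of these connections have been observed. Even so, a single pair of identical residues already yields eight candidate disaccharides, and the number grows quickly once branching and further residues are added.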
Glycan analysis is also complicated by the multitude of contexts in which this modification is found. In proteins, glycans can be attached at N-linked (Asn), O-linked (Ser/Thr) and C-linked (Trp) sites. Additionally, they are found in glycolipids including globosides and glycosphingolipids (9, 13). N-linked glycosylation occurs at the consensus sequence Asn-X-Ser/Thr (X = any amino acid except proline), and starts with the transfer of the core glycan GlcNAc2Man9Glc3 from a dolichol phosphate precursor to the asparagine residue. Not all predicted sites are modified and these differences in glycan occupancy contribute to the heterogeneous composition of glycoproteins (microheterogeneity) (2, 12, 14–16). Subsequent trimming and modification of the common core N-glycan creates a series of discrete glycan structures (Figure 2, panel a). In contrast, O-linked glycans are typically elaborated by subsequent additions of monosaccharides to an α-linked N-acetylgalactosamine (GalNAc) moiety linked to Ser/Thr residues (Figure 2, panel b). Unlike N-linked glycosylation, no consensus sequence has yet been found for this modification. Glycosaminoglycans (GAGs), another common form of Ser/Thr linked sugars, are elaborated off of a xylose core attached to the amino acid. In addition, fucose, mannose, and glucose have also been observed on Ser/Thr residues at the cell surface (2, 13, 17). In the nucleus and cytoplasm, the simple monosaccharide, β-O-N-acetylglucosamine (O-GlcNAc) is known to modify Ser/Thr residues and has been implicated in cellular signaling (18, 19). Glycolipids can also differ in their structures. For example, ceramide-containing lipids can be elaborated from either a galactose or glucose core residue attached to the lipid portion (9). Taken together, this diversity of contexts presents a significant challenge for glycomics.
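The N-linked consensus sequence lends itself to a simple pattern search. The sketch below is illustrative: the function name and the peptide are hypothetical, and a match marks only a predicted site, since not every sequon is occupied.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping sequons (for example "NNSS") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Hypothetical peptide, not a real protein sequence.
peptide = "MKNGSAANPTLLNITQ"
print(find_sequons(peptide))  # [3, 13]; the Asn at position 8 is followed by Pro
```

No comparable search can be written for O-linked sites, because no consensus sequence has been found for that modification.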
Our understanding of glycans is also complicated by the fact that synthesis of glycan motifs is not template driven, i.e., there is no direct genetic code for this complex modification. Several factors including levels of glycosyltransferases, glycosidases, nucleotide sugar transporters, and protein and lipid trafficking contribute significantly to the glycome of an organism or cell (12, 20). Redundancy in the glycosyltransferases further hampers our ability to dissect glycan synthesis as demonstrated by gene knockout experiments (21). Consequently, we are still striving to predict the glycome from genomic information, a current challenge in the field (22–24).
A final complicating factor in our understanding of glycosylation is the multivalent nature of glycan-protein interactions in biology. Monovalent interactions between carbohydrates and proteins such as lectins are typically in the millimolar to micromolar range (25). However, physiological interactions are often effected by increased avidity due to multivalent binding of carbohydrates to proteins. Local glycan concentration is determined by both the multivalent presentation of glycans within a single protein and the clustering of glycoproteins and glycolipids on the cell surface membrane (2, 25–28). Thus, it is important to examine glycan-protein interactions using appropriate methodologies which take into account the effects of multivalency and presentation. The biology of the glycome is a reflection of all these different factors, and requires a multi-pronged approach to comprehend in its entirety.
Meeting the challenge of glycomics requires the integration of techniques from multiple disciplines including chemistry, biochemistry, molecular biology, bioinformatics and biotechnology. Herein we provide an overview of these new and exciting methods. The first section focuses on the application of several technologies towards expanding our knowledge of the glycoproteome. The second section covers technologies for determining the structure of the glycome including mass spectrometry, HPLC and lectin microarrays. The final section provides an overview of an emerging technology for studying glycan-protein interactions – carbohydrate microarrays. Taken together, these techniques are stimulating a new understanding of the glycome.
To understand the role of glycans in modulating protein and lipid function, it is necessary to identify the molecular context of these oligosaccharides. The majority of work in this area has focused on glycoproteomics, a sub-discipline of proteomics that deals with the identification of proteins bearing glycans within the complex proteomic milieu, rather than on the carbohydrates themselves. In a typical glycoproteomic analysis, proteins are enzymatically cleaved using proteases and the glycopeptides are isolated using a variety of techniques. These peptides are then deglycosylated and the glycoproteins identified using mass spectrometry (Figure 3, panel a).
The isolation techniques employed are the main points of difference between different glycoproteomic strategies (Figure 3, panel b). The most widely applied isolation technique involves the use of lectin affinity chromatography for the separation of proteins modified with glycans (29–31). Lectins are carbohydrate-binding proteins that are not enzymes or antibodies, but which can bind to specific sugar residues (32, 33). Thus, these reagents can be used to differentially isolate subpopulations of the glycosylated pool that contain a particular sugar moiety. Glycopeptides that are selectively enriched on the lectin columns are often eluted by incubation with competing sugar molecules (29, 31). This general procedure has been used primarily for isolation of N-linked glycoproteins (34–36), although the O-linked subset of the proteome can also be characterized (37, 38). The choice of lectin affinity column determines whether glycosylated proteins containing a specific sugar residue (39, 40), for example fucosylated glycoproteins (36, 41), or a wider subset of the glycoproteome is targeted (42–44). Glycoproteins from a diverse array of samples including sera (42, 45), cancer cells (34, 35) and tissues (46, 47) have been identified using this method. For example, specific changes in the fucosylation of target biomarkers in the sera of patients with hepatocellular carcinoma were identified using this technique (41). One limitation of this technology, however, is that the lectin column chosen biases which glycoproteins are identified based on glycan composition, an issue for identifying the whole glycoproteome rather than a specific subset.
In order to increase coverage of the glycoproteome, several chemical approaches have been developed for isolation (Figure 3, panel b). One of the simplest methods is the enrichment of glycopeptides through hydrogen bonding interactions with cellulose or sepharose matrices (hydrophilic affinity chromatography, HILIC) (48–51). Although this method can be used for identifying both N-and O-linked glycoproteins, the reproducibility of the results is not ideal (52). Boronic acid based enrichment techniques have also been employed for the separation of glycoproteins. Boronic acid conjugated to a solid support forms a covalent bond with molecules containing cis-diol groups such as mannose and galactose and thus can selectively isolate glycopeptides (53, 54). An alternate method for identifying the N-linked glycoproteome relies on a combination of mild oxidation and the N-linked specific enzyme, PNGase F. Sodium periodate treatment oxidizes cis-diol groups in carbohydrate residues to aldehydes. These modified proteins are then reacted with either biotin hydrazide, for isolation with avidin beads, or with hydrazide beads, thereby pulling down glycoproteins selectively. Treatment of the solid support with proteases followed by PNGase F releases the N-linked glycopeptides which are subsequently identified (55–59). Recently, this technique was selectively applied to the identification of cell surface glycoproteins and used to visualize dynamic glycoproteomic changes that occur during T-cell activation and stem cell differentiation (60, 61).
Combinations of these approaches have recently been used to maximize coverage of the glycoproteome. McDonald and co-workers demonstrated that isolation of N-linked glycoproteins by the hydrazide capture method yielded dramatic differences in the proteins identified when compared to lectin affinity chromatography. Thus, these technologies can be used in tandem to isolate a broader spectrum of the N-linked glycoproteome (62). Boronic acid based enrichment has also been used together with lectin affinity chromatography to enhance the range of glycoproteins identified (63). Using hydrophilic affinity chromatography and hydrazide capture methods in parallel, a broader dataset of ~300 identified glycoproteins was obtained for the secretome of cancer cells (64). The use of these methodologies, either in parallel or in tandem, is still only capable of identifying fractions of the glycoproteome, thus emphasizing the need for further improvements in this area.
An alternative technique for selective isolation of a subset of the glycosylated pool takes advantage of chemical probes. These probes are unnatural analogs of simple sugars containing a bioorthogonal reactive handle. Typically, cells are grown in the presence of these probes, which are metabolized and incorporated into newly synthesized glycans via native biosynthetic machinery. A compatible reagent that reacts with the bioorthogonal handle is then used to either track these modified glycoproteins in living cells for in vivo imaging (65–67) or to isolate tagged proteins for glycoproteomic analysis (68) (Figure 4). For example, recent work has identified the sialylated portion of the glycoproteome using an alkynyl-modified N-acetylmannosamine precursor to orthogonally label the sialoproteome. Selective biotinylation via a Huisgen [3+2] cycloaddition to a biotin-azide, followed by pulldown with streptavidin beads, allowed for the identification of the tagged glycoproteins. Although this method cannot be used for clinical samples, it is still a valuable technique as unnatural analogs of different sugars can be used to identify specific subsets of the glycoproteome (68). This strategy has also been adapted for the metabolic labeling of proteins modified with O-GlcNAc using an azide derivative of GlcNAc (69, 70). An alternate scheme to identify O-GlcNAcylated proteins exploits the promiscuity of a mutant glycosyltransferase to label O-GlcNAc residues with a ketone-modified sugar analog. The ketone can then be used for isolation and subsequent identification of the glycoproteins (71–73). The development and application of these novel chemical tools has added a new dimension to the field of glycoproteomic analysis.
One drawback common to these methods is that they cannot accurately identify the sites of glycan modification in proteins. Typically, the potential N-linked glycosylation sites in glycopeptides are assigned based on the presence of the consensus sequence (36, 49, 52). However, site occupancy can vary for a predicted N-linked glycan (microheterogeneity), multiple consensus sequences may exist within a peptide, and nonglycosylated peptides can contaminate glycopeptide isolations. It is therefore important to incorporate additional methods to accurately identify the sites of glycosylation in a glycoproteomic analysis (74, 75). One method relies on the endoglycosidases Endo D and H, which cleave the bond connecting the two GlcNAc residues in N-linked sugars. The use of these enzymes, in tandem with other glycosidases, leaves the first GlcNAc residue at sites of glycosylation, allowing site-specific detection via mass spectrometry (50, 76). A more widely used method for identifying these sites is based on the use of PNGase F, which converts the asparagine into aspartic acid during deglycosylation. This change can be detected via either the subtle shift in mass (1 Da) (77) or the incorporation of an isotopic label. Deglycosylation in the presence of H₂¹⁸O incorporates the oxygen isotope at sites of N-glycosylation, creating a more detectable mass signature (3 Da) (74, 75). For both methods the mass signature combined with the presence of the consensus sequence is used to identify specific sites of occupancy. Isotopic incorporation has been used to identify novel sites of glycosylation in human platelet proteins (78). Additionally, sites of O-GlcNAcylation have been identified using beta elimination followed by a Michael addition by dithiothreitol to tag the modified site (79). The use of both traditional and newly developed chemical tools has significantly enhanced the field of glycoproteomics.
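The two mass signatures quoted above for PNGase F treatment follow from the atoms exchanged when the asparagine side-chain amide becomes a carboxyl group. The sketch below uses standard monoisotopic atomic masses; the function name is illustrative and the calculation is a back-of-the-envelope check rather than a description of any particular search engine.

```python
# Monoisotopic atomic masses in daltons (standard values, rounded).
MASS_H = 1.007825
MASS_N = 14.003074
MASS_O16 = 15.994915
MASS_O18 = 17.999161

def asn_to_asp_shift(oxygen_mass=MASS_O16):
    """Mass change when the Asn side-chain amide (-NH2) becomes a carboxyl (-OH)."""
    # Net change: lose one N and one H, gain one O (from the solvent water).
    return oxygen_mass - MASS_N - MASS_H

shift_h2o = asn_to_asp_shift()            # deglycosylation in ordinary water
shift_h2o18 = asn_to_asp_shift(MASS_O18)  # deglycosylation in 18O-labeled water
print(f"{shift_h2o:.3f} Da, {shift_h2o18:.3f} Da")
```

The results, roughly 0.984 Da and 2.988 Da, correspond to the nominal 1 Da and 3 Da shifts described in the text. The larger shift is easier to separate from the natural isotope envelope of the unmodified peptide, which is why isotopic incorporation gives a more detectable signature.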
Glycomic analysis by mass spectrometry presents an inherently greater challenge than glycoproteomic analysis. The structural variations in glycan linkages, coupled with the identical mass of epimeric monosaccharides, make identification of glycan structures difficult (14, 80). Mass spectrometric analysis of glycans typically consists of the following steps: 1) glycan release by either PNGase F treatment (for N-linked glycans) or β-elimination (for O-linked glycans); 2) glycan isolation using solid phase separation techniques; 3) glycan derivatization and ionization; and 4) annotation of the resulting spectra to list possible glycan structures (Figure 5) (80–87). In the past few years, significant advances have been made in the isolation, derivatization, and ionization techniques used for glycomic analysis. These methods increased reproducibility and considerably reduced the time required for glycomics by mass spectrometry (for a comprehensive review on this topic, see (80)). However, the interpretation of spectral data is still a major bottleneck for high-throughput analysis. This is due both to the mass degeneracy of glycans and to the complicated fragmentation patterns which arise from the more complex MS strategies required to obtain precise structural information. Although programs such as Cartoonist, GlycoPep DB and SysBio Ware exist to automate some aspects of data interpretation, software-based interpretation of glycan spectral data is still in its infancy (88–90). The recent advances in mass spectrometry have made it a widely used technique for the detection of glycan epitopes from a wide range of samples including bacterial and mammalian cells, sera, and tissue samples (80, 82, 91–97). In recent work, mass spectrometry was used to identify glycan epitopes which were specifically elevated in hepatocellular carcinoma patients in a study involving ~200 participants, underscoring the power of this technique for glycomic analysis (94).
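The mass degeneracy mentioned above can be illustrated with a small calculation. This is a sketch under stated assumptions: the residue masses are standard monoisotopic values for free, underivatized glycans, the names `RESIDUE_CLASS` and `glycan_mass` are hypothetical, and the second composition is invented purely to show the effect.

```python
# Monoisotopic residue masses in daltons (monosaccharide minus water).
RESIDUE_MASS = {
    "Hex": 162.0528,     # Glc, Gal and Man are epimers with identical mass
    "HexNAc": 203.0794,  # GlcNAc and GalNAc
    "dHex": 146.0579,    # Fuc
    "NeuAc": 291.0954,   # N-acetylneuraminic acid
}
WATER = 18.0106

# Mapping from named monosaccharides to their mass class.
RESIDUE_CLASS = {
    "Glc": "Hex", "Gal": "Hex", "Man": "Hex",
    "GlcNAc": "HexNAc", "GalNAc": "HexNAc",
    "Fuc": "dHex", "Neu5Ac": "NeuAc",
}

def glycan_mass(residues):
    """Neutral monoisotopic mass of a free, underivatized glycan."""
    return sum(RESIDUE_MASS[RESIDUE_CLASS[r]] for r in residues) + WATER

core = glycan_mass(["GlcNAc", "GlcNAc", "Man", "Man", "Man"])
invented = glycan_mass(["GalNAc", "GlcNAc", "Gal", "Glc", "Man"])
print(round(core, 4), core == invented)
```

Both compositions reduce to Hex3HexNAc2 and give the same mass, so a single MS measurement cannot tell them apart. Linkage position and anomeric configuration are equally invisible at this level, which is why fragmentation experiments and biosynthetic assumptions are needed for annotation.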
Following the example of work enabling comparative proteomic profiling, isotopic labeling strategies have recently been developed for comparative glycomics by mass spectrometry. In brief, glycans are differentially labeled with isotope tags, mixed and analyzed via mass spectrometry (Figure 6, panel a). Labeling of the glycans can be achieved via either chemical or metabolic means. Chemical derivatization methods take advantage of the unique chemistry at the reducing end of a cleaved glycan, namely the cyclic hemiacetal which is in equilibrium with the open-chain aldehyde (Figure 6, panel b). Several different labeling reagents have been developed for the incorporation of an isotopic tag. The simplest of these is the reduction of the open-chain aldehyde by a labeled reducing agent such as deuterated sodium borohydride (98). Reductive amination using 13C-labeled aniline has also been used to selectively derivatize glycans from serum samples and obtain relative concentrations of sugars (99). Differential labeling of samples is also achieved with deuterated hydroxylamines, which can react with cleaved glycans to form the corresponding oximes (100). Relative quantitation of either O-linked or N-linked glycans can be obtained using these methods (98, 100–105). Recently, metabolic labeling of glycans has been achieved by a clever new strategy called IDAWG (isotopic detection of aminosugars with glutamine). In this method, glutamine containing heavy or light isotopes of nitrogen in the amide of the side chain is incubated with cells. Since the amide nitrogen is the sole nitrogen source in the synthesis of UDP-GlcNAc, this isotope is subsequently incorporated into GlcNAc, GalNAc and sialic acid residues (Figure 7, panel a). Comparative glycomic profiles of cells grown in the light and heavy medium are then obtained (Figure 7, panel b). This approach has been used to differentially label and visualize glycan epitopes during differentiation of embryonic stem cells (106).
Given that these sugars are found in both N-linked and O-linked glycans, this technique holds great promise for tracking dynamic changes in glycan synthesis and degradation.
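Because each GlcNAc, GalNAc and sialic acid residue carries a single nitrogen atom, the expected heavy/light mass difference in an IDAWG experiment scales with the number of amino sugars in the glycan. The sketch below assumes complete 15N incorporation and uses a hypothetical composition; the names `idawg_shift` and `NITROGENS_PER_RESIDUE` are illustrative.

```python
# Mass difference between 15N and 14N in daltons (standard monoisotopic values).
DELTA_15N = 15.000109 - 14.003074

# One nitrogen per amino sugar named in the text; hexoses and fucose carry none.
NITROGENS_PER_RESIDUE = {
    "GlcNAc": 1, "GalNAc": 1, "Neu5Ac": 1,
    "Man": 0, "Gal": 0, "Glc": 0, "Fuc": 0,
}

def idawg_shift(composition):
    """Expected heavy-minus-light mass difference, assuming full 15N labeling."""
    n_count = sum(NITROGENS_PER_RESIDUE[res] * n for res, n in composition.items())
    return n_count * DELTA_15N

# Hypothetical biantennary, disialylated N-glycan composition.
glycan = {"GlcNAc": 4, "Man": 3, "Gal": 2, "Neu5Ac": 2}
print(round(idawg_shift(glycan), 3))  # six nitrogens, about 5.982 Da
```

A glycan with no amino sugars would show no shift at all, so the spacing between light and heavy peaks also acts as a check on the assigned composition.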
Although mass spectrometry can be a powerful technique for glycomics, there are several issues that are yet to be resolved. Because of the need to simplify and purify the glycome prior to analysis, mass spectrometric techniques often provide a snapshot of only a select portion of the glycome. Most commonly only the N-linked glycome is analyzed, as in the example given previously for hepatocellular carcinoma (94). Relatively few methods exist for the identification of glycans from GAGs (107, 108) and glycolipids (109, 110). Thus, these portions of the glycome are currently underrepresented in the glycomic data set. A second major issue for mass spectrometry as a high-throughput glycomic technique is the fact that further analysis is typically required to obtain linkage information from a mass spectrometric glycan fingerprint (85). Annotation of the N-linked glycome from a simple spectrum typically relies upon assumptions based on glycan biosynthetic pathways, limiting these analyses to well characterized biological systems.
An alternate method for glycomic analysis is based on the separation of oligosaccharides using HPLC. This technique relies on the unique chemistry of carbohydrates to label and visualize their separation. Typically, N-linked glycans are released using PNGase F, tagged at the reducing end with a fluorescent label such as 2-aminobenzamide or anthranilic acid, and analyzed using HPLC. The differential HPLC traces generated can be used to compare the general N-linked glycomic profiles of multiple samples. This methodology has been used for examining the N-linked glycan pools in samples such as serum (111, 112). A recent study utilized this method to observe high variability in the human N-linked glycome, providing initial evidence for the role of genetics and environmental factors in the composition of human glycome (112). A caveat to this method is that the HPLC profiles can contain multiple glycans in each peak and thus changes in the HPLC profiles are difficult to interpret at the level of individual glycan structures. Further analysis via enzyme degradation and sophisticated data interpretation is needed to obtain more specific structural information from this technique. The development of automated tools for annotation of HPLC-derived glycomic profiles and new glycan HPLC standards may help to offset some of these disadvantages (113).
A simple yet powerful new technology for comprehensive glycan analysis, lectin microarrays, consist of a panel of lectins with distinct binding specificities that are spotted onto a solid support. The microarray is then interrogated with fluorescently labeled analytes and the known binding specificities of lectins are used to interpret the resultant pattern and annotate glycan epitopes present in the sample of interest (Figure 8) (114–116). Glycan epitopes from a variety of samples including glycoproteins (116–119), cellular membranes (120), whole mammalian cells (121, 122), pathogenic bacteria (115, 123), and viruses (124) have been profiled using this technique.
Our laboratory has advanced this technology further by developing a sensitive and robust two-color approach for the semi-quantitative analysis of glycans from complex mixtures (Figure 9). In brief, the sample and a common biological reference are differentially labeled with Cy3- and Cy5- dyes respectively, mixed in equal amounts, and used to probe a lectin microarray. Competitive binding between the two analytes gives ratiometric data. The use of a common reference addresses issues arising from differences in sample labeling and inherent differences in lectin activity that can arise from minor alterations in the print conditions and variations observed with different lectin preparations (120). Moreover, the direct comparison of multiple samples to a single internal standard allows subtle changes in levels of glycans to be unraveled using this approach, in contrast to the more commonly used single color approach (120, 125). Recently, we utilized our two-color method for the comparative glycomics of whole HIV-1 virions and immunomodulatory nanoparticles called microvesicles. This analysis revealed that the glycomes of HIV-1 and microvesicles derived from T-cells are nearly identical, lending strong support to the theory that HIV-1 co-opts the microvesicular exocytic mechanism to exit T-cells (124).
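The ratiometric readout of a two-color experiment can be sketched as follows. This is a minimal illustration, assuming background-corrected intensities and the log2(sample/reference) convention common to two-color microarrays; it is not the published processing pipeline, and the function name and intensity values are invented.

```python
import math

def log2_ratios(sample_cy3, reference_cy5):
    """Return log2(sample/reference) per lectin from background-corrected intensities."""
    ratios = {}
    for lectin, sample_signal in sample_cy3.items():
        reference_signal = reference_cy5[lectin]
        if sample_signal <= 0 or reference_signal <= 0:
            continue  # skip spots without usable signal in both channels
        ratios[lectin] = math.log2(sample_signal / reference_signal)
    return ratios

# Made-up intensities; the lectin names serve only as labels.
sample = {"ConA": 8000.0, "SNA": 500.0, "UEA-I": 1200.0}
reference = {"ConA": 4000.0, "SNA": 2000.0, "UEA-I": 1200.0}
print(log2_ratios(sample, reference))
```

Positive values indicate more binding from the sample than from the reference and negative values less. Because every sample is compared with the same reference on the same spot, spot-to-spot differences in lectin activity largely cancel in the ratio.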
Lectin microarray technology has several advantages over other glycan analysis techniques. Unlike analysis by mass spectrometry, it is possible to simultaneously observe glycolipids and N- and O-linked oligosaccharides using the array (120). Additionally the lectin microarray format is multivalent, mimicking the physiologically relevant context of glycan-protein interactions. Because the binding specificities of many of these lectins have been extensively characterized using carbohydrate arrays (see below), it is possible to obtain linkage-specific information on the glycome, which is another advantage of the arrays. Lectin microarrays can be easily fabricated and analyzed using widely available equipment (114, 115). In addition, there are ~100 commercially available lectins, the majority of which are naturally purified from plant sources and are currently used in the array format (9, 14, 126, 127). These factors combine to make lectin arrays a powerful technique for deconvoluting the glycome.
It should be noted, however, that this technology still has several drawbacks. First, glycan structural elucidation is restricted to the motifs that are recognized by the lectins on the array. Thus, the availability of carbohydrate-binding proteins is a limiting factor. For example, there is a dearth of carbohydrate-binding proteins that are known to be specific for unique bacterial glycans; therefore these epitopes are invisible to current lectin microarrays. Additionally, some of the plant lectins are glycosylated, which poses a potential problem when the array is probed with samples that may contain endogenous lectins (128). Currently, we and others are using a wide variety of approaches including recombinant lectins (128), the creation of synthetic receptors (129) and the evolution of new lectin epitopes in efforts to increase the number of unglycosylated carbohydrate binders for the detection of diverse glycan epitopes (128, 130, 131). A final drawback is that, unlike the more detailed analysis that can be obtained by higher order mass spectrometry, this technique allows the identification of structural epitopes but not unique molecular structures. However, at present it is the specific epitopes, rather than individual structures, that are thought to encode biological activity, and this methodology provides a rapid method of assessing those epitopes for a large number of samples.
Lectins have also been used as a probe for detecting changes in glycosylation on target proteins displayed in an array format. There are two variations of this technology. The first uses antibody arrays to pull down specific glycoproteins and then assays these proteins for glycan changes using lectins and glycan-specific antibodies (Figure 10, panel a) (132). In recent work, this technique was used to detect alterations in the glycosylation of MUC1 and CEA (carcinoembryonic antigen), both potential cancer biomarkers, in sera from pancreatic cancer patients (132) and to investigate cancer-related changes in glycosylation of alpha1-β-glycoprotein (133). In the second microarray variation, glycoproteins are printed directly onto the array and assayed. In brief, glycoproteins from samples (typically serum) are isolated by lectin affinity chromatography and then fractionated using HPLC (Figure 10, panel b). These fractions are spotted in array format and probed with select lectins to identify potential glycan changes. Glycoprotein fractions that display differences in the cancer state are resolved further and the individual proteins responsible for the differential glycosylation are identified using mass spectrometry (134–136). Candidate glycoprotein biomarkers displaying increased sialylation and fucosylation in colorectal cancer samples were identified using this technique (136). These methods, while useful for specific biomarker discovery, are not meant as a general method to survey the glycoproteome. They focus on identifying specific portions of the glycoproteome, restricted by either the anti-glycoprotein antibodies arrayed or the limited number of lectins with which the glycoproteome is surveyed for changes. Thus, they are targeted for a select purpose, namely biomarker discovery and validation. 
In general, the use of microarray technologies is not only advancing translational biomarker research but is also transforming the field of glycan structural characterization via the lectin microarrays, and provides new technology for dissecting the intimate relationship between the structural and functional aspects of glycans.
To develop a more comprehensive understanding of the glycome we must explore the fundamental interactions by which the glycome is decoded i.e. the binding of proteins to glycans. Techniques for the study of glycan-protein interactions include surface plasmon resonance (SPR), frontal affinity chromatography (FAC) and the more traditional ELISA methods (29, 137–139). A new technology, the glycan microarray, has rapidly advanced our knowledge of glycan-binding proteins by enabling the simultaneous evaluation of multiple glycan-protein binding interactions. These arrays consist of oligosaccharides from chemical, enzymatic and natural sources spotted in a high density format. Immobilization of the glycans onto a solid support is most commonly through a covalent linkage. Chemistries used for this purpose include the covalent coupling of thiols and maleimides, NHS-esters and amines, and amines and epoxides, among others (for comprehensive reviews regarding this topic, see (14, 140–146)). Glycan display is multivalent, enabling detection of glycan-protein interactions that are often weak at the monovalent level. Interactions of the glycan-binding protein (GBP) with the array are typically detected via fluorescence, where the fluorophore is either directly attached to the GBP, or is on a secondary reagent used to detect the GBP (Figure 11). Carbohydrate microarrays have been used to characterize a diverse range of GBPs (138, 147–154) including those involved in viral pathogenesis and immune recognition (155–157). For example, recent work using carbohydrate microarrays to characterize the binding of mutant influenza viruses showed that mutations previously found in pandemic strains, when introduced into current avian influenza strains of H5N1, cause a shift in glycan binding towards human sialic acid epitopes, thus demonstrating the increased propensity of these new H5N1 viruses towards the generation of a pandemic strain (158). 
For comprehensive reviews on the use of glycan arrays in biomedical applications see (159–161).
The multivalent presentation of carbohydrates is one of the attractive features of the array format. It has long been known that subtle differences in carbohydrate presentation and density can alter binding specificities of carbohydrate binding proteins (162–164). Recent work has shown that varying glycan density within an array format can reveal higher order binding patterns in seemingly identical GBPs, confirming the strong influence of density on glycan recognition (165). This study underscores the need for incorporating multiple presentations of the same glycan epitope in the array format to obtain biologically relevant specificities.
At present, carbohydrate microarrays are limited by the variety of epitopes presented and the mode of glycan presentation. In spite of the tremendous improvements in carbohydrate synthesis, synthesis of glycans is still a major limiting factor in expanding this array technology as it is both time consuming and difficult (145, 166, 167). In addition, well-defined molecular architectures for carbohydrate display on a solid support are still needed to better represent the variety of glycan presentations found in nature in the context of the array.
The challenge of glycomics is being met with an arsenal of novel analytical tools which are revealing the structure and functions of these crucial biopolymers. These novel technologies have generated an immense amount of glycomic information in a short period of time. This is necessitating the development of bioinformatic tools for data integration. Publicly available databases include those from the Consortium for Functional Glycomics (CFG) (168), KEGG (Kyoto Encyclopedia of Genes and Genomes) (169) and Glycosciences.de (170). Although these provide an initial framework for including glycomic information in databases, there is still a long way to go in the development of advanced databases and algorithms which will allow scientists rapid access to information generated via different glycomic tools across multiple labs. Generation of parallel glycomic, genomic and glycoproteomic datasets is becoming possible, bringing the opportunity to integrate these datasets to gain an understanding of the fundamental control mechanisms underpinning the glycome and ultimately to model the glycome from genomic information (22–24). Given the importance of carbohydrates in various aspects of biology ranging from microbial pathogenesis to cell differentiation, the rapidly evolving area of glycomics has the potential to usher in a new understanding of biological recognition that will have a strong impact on diagnostic and therapeutic medicine.
L.K.M. would like to acknowledge funding from the NIH (Grant: 1 DP2 OD004711-01).