Mass spectrometry continues to play a vital role in defining the structures of N- and O-glycans in glycoproteins via glycomic and glycoproteomic methodologies. The former seeks to define the total N- and/or O-glycan repertoire in a biological sample whilst the latter is concerned with the analysis of glycopeptides. Recent technical developments have included improvements in tandem mass spectrometry (MS/MS and MSn) sequencing methodologies, more sensitive methods for analysing sulfated and polysialylated glycans and better procedures for defining sites of O-glycosylation. New tools have been introduced to assist data handling and publicly accessible databases are being populated with glycomics data. Progress is exemplified by recent research in the fields of glycoimmunology, reproductive glycobiology, stem cells, bacterial glycosylation and non-mucin O-glycosylation.
We last reviewed advances in the areas of glycomics and glycoproteomics and the rise in prominence of international glycobiology consortia in the 2006 volume of this journal. In this updated review, focusing on the period of early-2007 to mid-2009, we first describe key advances in mass spectrometry (MS) and glycoinformatics. This is followed by brief overviews of some of the most active areas in which glycomic/glycoproteomic methodologies are being applied to the study of biologically significant glycoconjugate structure/function relationships. To set these activities in context, an overview of current glycomic and glycoproteomic strategies is presented in Figure 1.
A driving force of recent advancement in the field of structural glycobiology has been on-going improvements and diversification in MS hardware. Notably, the increased utilization of matrix-assisted laser desorption ionisation tandem time-of-flight (MALDI-TOF/TOF) instrumentation has dramatically increased performance in terms of upper mass range, resolution, sensitivity and signal-to-noise ratios. As demonstrated by a re-evaluation of human neutrophil glycosylation, high mass signals beyond m/z 6500 were observed and structurally informative MS/MS data were generated from signals in the region of m/z 4000. Indeed, utilization of such technology has allowed us to observe N-glycan structures at above m/z 13,000 (data not shown). Also, the ability of ion trap instrumentation to facilitate MSn experiments, especially on glycans which have been derivatised by permethylation, is allowing unambiguous structural assignment of isomeric glycans. The application of alternative methodologies to induce fragmentation of glycopeptides such as electron capture dissociation (ECD) and electron transfer dissociation (ETD) is allowing the direct mapping of sites of both N- and O-glycosylation.
Standardization and quantitation continue to be key objectives of glycomic analyses. The Human Disease Glycomics/Proteome Initiative (HGPI; URL:http://www.hgpi.jp), sponsored by the Human Proteome Organization (HUPO), continues to take a leading role in such efforts. A second pilot study utilizing human IgA to assess methodologies for O-glycan analysis has now been undertaken. Two general strategies, direct MS analysis of mixtures of permethylated reduced glycans in the positive ion mode and analysis of native reduced glycans in the negative ion mode using liquid chromatography-MS (LC-MS) approaches, were found to give the most reliable data. The ultra high resolving power of Fourier transform ion cyclotron resonance (FTICR) MS allows quantitative comparative glycomic analysis by exploiting the small difference in mass generated by differential permethylation with either 13CH3I or 12CH2DI.
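The size of the mass difference exploited by differential permethylation can be estimated from standard monoisotopic atomic masses. The following is a minimal sketch only: the atomic masses are textbook values rather than figures from the studies cited, and the function names and the example of a glycan carrying 30 methyl groups at m/z 2000 are hypothetical illustrations.

```python
# Monoisotopic atomic masses in Da (standard reference values, not from the review).
MASS = {"H": 1.00782503, "D": 2.01410178, "C12": 12.0, "C13": 13.00335484}


def methyl_mass(carbon, n_h, n_d):
    """Mass of one methyl group built from the given carbon isotope, n_h H and n_d D."""
    return MASS[carbon] + n_h * MASS["H"] + n_d * MASS["D"]


# The two isobaric labels: 13CH3 (from 13CH3I) and 12CH2D (from 12CH2DI).
M_13CH3 = methyl_mass("C13", 3, 0)
M_12CH2D = methyl_mass("C12", 2, 1)

# Mass difference contributed by each methylation site (roughly 0.0029 Da).
DELTA_PER_METHYL = M_12CH2D - M_13CH3


def label_mass_difference(n_methyls):
    """Total mass difference (Da) between the two labelled forms of one glycan."""
    return n_methyls * DELTA_PER_METHYL


def required_resolving_power(mz, n_methyls, charge=1):
    """Approximate resolving power (m / delta m) needed to separate the label pair."""
    delta_mz = label_mass_difference(n_methyls) / charge
    return mz / delta_mz


if __name__ == "__main__":
    print(f"Difference per methyl group: {DELTA_PER_METHYL:.5f} Da")
    # Hypothetical example: a glycan with 30 methylation sites observed at m/z 2000.
    print(f"Pair spacing: {label_mass_difference(30):.4f} Da")
    print(f"Resolving power needed: {required_resolving_power(2000.0, 30):.0f}")
```

Because the spacing between the two labelled forms is only a few millidaltons per methyl group, the pair appears as a closely spaced doublet, which is consistent with the need for an ultra high resolution analyser.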
The exploitation of MS instrumentation developments in conjunction with continued advancements in supporting technologies such as liquid chromatography has allowed previously refractory glycobiological structural issues to be tackled. Glycans with sulfo-modifications constitute important recognition codes in cell adhesion. Tandem high energy and multiple-stages of low energy collision induced dissociation (CID) MS/MS sequencing in conjunction with microscale permethylation derivatization and front end chromatographic separation were utilized to allow the identification of N-glycans bearing the 6-sulfo sialyl Lewisx epitope which plays a role in L-selectin-dependent lymphocyte homing and recruitment. Recent breakthroughs have also been made in the structural analysis of other highly charged glycoconjugates such as glycosaminoglycans and those with polysialylation. The high levels of heterogeneous O-glycosylation often associated with mucins are vital for their functions but have made structural characterization difficult. The application of nanoLC separation of native β-eliminated O-glycans on graphitized carbon columns coupled to electrospray (ES) ion trap instrumentation operated in the negative ion mode has allowed low pmole levels of sensitivity and is affording new levels of understanding of both human gastrointestinal and respiratory tract mucins. An alternative, similarly sensitive, strategy for mucin analysis involves direct glycomic profiling of mixtures of permethylated glycans. This approach was recently employed to characterise changes in O-glycosylation of mucin rich tissues from the gastro-intestinal tract of Core 2 knockout mice.
Glycomics, probably even more than any other current “-omic” field of study, places a great burden of data upon the researcher, necessitating informatics solutions. The computational challenges and approaches being employed to address this by no means simple task are summarised in two excellent reviews [13,14″]. In essence, the ever-increasing complexity and quantity of data that are routinely produced from the glycomic analysis of cells and tissues require support, in the form of data repositories and software tools, to facilitate faster analysis and more meaningful interrogation of the combined glycomic system.
To this end, databases of complex glycan structural data were born from the ashes of CarbBank, the five most prominent publicly available examples of which are the Consortium for Functional Glycomics’ (CFG) relational database (http://www.functionalglycomics.org/glycomics/common/jsp/firstpage.jsp), the Kyoto Encyclopedia of Genes and Genomes glycome informatics resource (KEGG GLYCAN) (http://www.genome.jp/kegg/glycan/), the Japan Consortium for Glycobiology and Glycotechnology DataBases (JCGGDB) (http://jcggdb.jp/index_en.html), Glycosciences.de (http://www.dkfz.de/spec/glycosciences.de/sweetdb/index.php) and EuroCarbDB (http://www.ebi.ac.uk/eurocarb/home.action). Of course, whenever databases are developed independently of each other, the problem of format and language will present itself. Each of the initiatives has developed its own standards, tools and databases, so as these projects grew (and the commercial and private access versions along with them) the issue of inter-compatibility was confronted and directly addressed by the glycoinformatics community. In response to this lack of standardisation, the Complex Carbohydrate Research Center (CCRC) at the University of Georgia has developed the GLYDE-II XML representation which is being accepted as the standard format for the exchange of carbohydrate structural data (CCRC Glyde-II; URL:http://glycomics.ccrc.uga.edu/core4/informatics-glyde-ii.html).
In order to better populate these databases, the development of software tools to aid the arduous process of data analysis and annotation is of great importance. The complexity of glycoconjugates and the variety of techniques employed in the elucidation of their structures present significant roadblocks to an integrated single-package solution for glycomic analysis. Nonetheless, there is an increasing number of algorithms and tools designed to support these experiments [13,14″]. Perhaps the most promising of the glycomic mass spectrometric interpretation approaches is Cartoonist, which has been adopted by the CFG and is designed to mimic the approach of a human expert in the annotation of MS data. Like many of the structure analysis programs, the Cartoonist algorithms are being continually refined to make them easier for non-expert users, which will enable a greater proportion of the scientific community to utilise the data presented by the repositories in a meaningful way.
The closest to a complete glycomic MS analysis tool thus far developed is the GlycoWorkBench tool developed by the EuroCarbDB initiative. It provides support to the manual interpretation of MS glycomic data, incorporating an increasing number of user-friendly features designed to assist the researcher. These include a glycan structure editor, mass-based structure prediction, fragmentation prediction tools and the semi-automated assignment of MS/MS spectra. Due to its modular design, GlycoWorkBench even has the capacity to integrate related tools, such as Cartoonist, into its interface.
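The idea behind mass-based prediction can be illustrated with a simple composition search. The sketch below is not the GlycoWorkBench algorithm or its API; it is a hypothetical brute-force enumeration that assumes free reducing-end permethylated glycans observed as singly sodiated ions, uses commonly quoted approximate monoisotopic residue masses, and restricts the search to four residue types with arbitrary upper limits.

```python
from itertools import product

# Approximate monoisotopic masses (Da) of permethylated residues (assumed values).
RESIDUE = {"Hex": 204.0998, "HexNAc": 245.1263, "Fuc": 174.0892, "NeuAc": 361.1737}
REDUCING_END = 46.0419  # terminal CH3 plus OCH3 of a free reducing-end glycan
SODIUM = 22.9898        # approximate mass added by Na+ adduction

# Arbitrary search limits chosen for illustration only.
DEFAULT_MAX = {"Hex": 10, "HexNAc": 8, "Fuc": 4, "NeuAc": 4}


def mz_sodiated(composition):
    """Calculated [M+Na]+ m/z for a composition given as {residue: count}."""
    residues = sum(RESIDUE[name] * count for name, count in composition.items())
    return residues + REDUCING_END + SODIUM


def candidate_compositions(observed_mz, tol=0.05, max_counts=None):
    """Return all compositions whose calculated m/z lies within tol of observed_mz."""
    limits = max_counts or DEFAULT_MAX
    names = list(limits)
    hits = []
    for counts in product(*(range(limits[n] + 1) for n in names)):
        composition = dict(zip(names, counts))
        error = abs(mz_sodiated(composition) - observed_mz)
        if error <= tol:
            hits.append((error, composition))
    # Best match first.
    return [composition for _, composition in sorted(hits, key=lambda h: h[0])]


if __name__ == "__main__":
    for comp in candidate_compositions(1579.78):
        print(comp, round(mz_sodiated(comp), 4))
```

A composition is only a starting point: many isomeric structures share one composition, which is why tools of this kind combine mass matching with fragmentation prediction and MS/MS assignment.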
Of course, as the analysis of glycomic data becomes more accessible and more efficient through use of these informatic tools, the public databases (assuming they obtain sustainable funding) will continue to be populated with structurally-derived information, all of which will be accessible via the GLYDE-II XML exchange format. This broad and, most importantly, freely accessible base of data will begin to allow mining of the information in more and more meaningful ways, such as the prediction of glycan structures or potential bio-markers. Current and potential avenues of research in this area are discussed at length in the aforementioned pair of reviews [13,14″], which also include useful reference information and links concerning the vast majority of the current initiatives.
Cell surface glycans play diverse roles in the immune system and determining the glycomes of lymphocyte populations has become an important goal for understanding function. Glycomic analysis of murine and human immune cells is a major activity of the CFG and a considerable volume of glycomics data, complemented by gene expression data, is already available in the CFG’s public databases (CFG glycan profiling pages; URL:http://www.functionalglycomics.org/glycomics/publicdata/glycoprofiling.jsp). The reproducibility of glycomic methodologies has been clearly illustrated by the characterisation of two preparations of human neutrophils, sourced from the US and UK, respectively, which gave remarkably similar MALDI profiles over a 6,000 Da mass range. This study also exemplifies the ultra-high sensitivity of MS/MS technology which is capable of detecting trace levels of functionally important epitopes such as sialyl Lewisx in very complex mixtures of glycans.
Changes in glycosylation are known to occur during lymphocyte maturation and activation and these changes are believed to influence interactions with lectin receptors and subsequent signalling events. Until recently, little was known concerning the precise nature of glycan re-modelling in the immune system. However this is being rectified by the emergence of extensive collaborations between the glycobiology and immunology communities, as exemplified by studies of differentiation and maturation of human dendritic cells which have revealed changes in glycan expression affecting recognition by siglecs, galectins and selectins [19,20].
Arguably one of the hottest areas of recent glyco-immunology has involved mass spectrometric characterisation of the sialylation of N-glycans on the Fc domain of human IgG1 [21,22″]. This might seem surprising because IgG glycosylation has been studied countless times in the quarter of a century that has elapsed since the seminal discovery that alterations in glycosylation of total serum IgG are associated with rheumatoid arthritis and primary osteoarthritis. The fact that new biological discoveries are now being made is not because novel structures are being determined on IgG1. Rather it is because precise structural characterisation has been complemented by synthesis of a specific glycoform of the Fc fragment which has been exploited as a tool for investigating the basis of the anti-inflammatory activity of the intravenous immunoglobulin that is widely used to treat auto-immune diseases.
Tissues, cells and fluids associated with mammalian reproduction express glycan structures, for example Lewisy, that are rarely found elsewhere in the body and are often considered to be cancer antigens. These glycan sequences are thought to be immunomodulatory in cancer states and this activity is likely to be mirrored in reproductive phenomena that require immunosuppression such as toleration of the embryo in utero, and evasion of the female immune system by sperm. Considerable progress has been made during the review period in documenting mammalian reproduction-associated glycomes, as illustrated by work on human sperm [24″], uterine and genital tract fluids of the mouse [25″″], human glycodelins [26″″] and bovine pregnancy associated glycoproteins (PAGs). The first two publications demonstrate that humans and mice share many terminal epitopes, with Lewis antigens being especially abundant. However there are some striking differences. Notably Lewisy is abundant in the male reproductive tract of humans and virtually absent in females. In contrast the uterine fluid of the female mouse is rich in this structure with none being detected in the seminal fluid glycome of the male mouse. As shown in Figure 2, the glycodelins and PAGs exemplify opposite ends of a spectrum of glycan diversity on an individual reproductive glycoprotein, the former having hundreds of different glycoforms, whilst a single tetra-antennary glycan is the dominant structure on the latter.
Stem cell glycomics is an emerging area of glycobiology. When the field was reviewed in 2007, the emphasis was on immunologically defined epitopes and very few structures had been rigorously defined using physicochemical methods. Since then the CCRC has received funding from the NIH for stem cell glycomics and some of their data are available in a public database (CCRC stem cell glycomics data; URL:http://glycomics.ccrc.uga.edu/core2/glycomics-data.html). Other workers have reported on the N- and O-glycomes and associated gene expression of human hematopoietic stem and progenitor cells and human mesenchymal stem cells [29,30].
Once thought to be restricted to Eukarya, it is now firmly established that both Bacteria and Archaea are capable of protein glycosylation. Our understanding of bacterial N- and O-glycosylation is probably best exemplified by the pathogen Campylobacter jejuni. The expanding repertoire of sequenced bacterial genomes has revealed an abundance of genes with a propensity for glycosylation, with pathways shown to occur in other bacteria, such as O-glycosylation in Neisseria gonorrhoeae, whilst N-glycosylation is more widespread in Archaea. The functional characterization of these bacterial glycosylation pathways has been greatly facilitated by mass spectrometry and NMR, which have proved to be powerful tools for characterizing the often novel sugars and modifications involved. Complementary ‘bottom up’ and ‘top down’ approaches (see Figure 3), assisted in some cases by metabolomics, have been employed in unravelling the glycosylation processes [34,35]. An exciting new approach to better understand the prokaryotic glycome is being pioneered at the Centre for Integrative Systems Biology at Imperial College (CISBIC), where a multidisciplinary research team are combining ‘omics’ experimentation with mathematical and computer modelling to study the interaction dynamics between the bacterial glycome and host innate immune responses.
Although the majority of O-linked glycans are of the mucin-type which means that they are attached to serine or threonine via a GalNAc residue, an increasing number of cell surface and secreted glycoproteins are now known to carry glycans linked via O-Man, O-Fuc, O-Glc or C-Man (see Figure 4). Moreover many cytoplasmic glycoproteins are dynamically substituted with O-GlcNAc which shares features with phosphorylation. Mass spectrometry continues to play a pivotal role in helping to decipher the functions of these rare and/or low abundance glycosylations, many of which are implicated in signalling processes. The introduction of ETD as a complement to CID for MS/MS experiments (see Technical Advances Section), has greatly facilitated research in this field by providing a highly sensitive means for defining sites of glycosylation. The field of O-GlcNAcylation is arguably benefiting most from this technology because the challenges posed by the dynamic nature of O-GlcNAc glycosylation, its very low stoichiometry and the high chemical lability of the sugar-protein linkage, have conspired over many years to limit the number of defined O-GlcNAc sites to a very tiny fraction of the likely “O-GlcNAc-ome”. O-GlcNAc characterisation is also benefiting from improvements in enrichment strategies [39″,40] and quantitative phosphoproteomics [41″″]. With respect to the other classes of aforementioned glycans, significant developments during the review period include the discovery that the addition of O-Glc to proteins is catalysed by the product of the rumi gene, that O-fucosylation is required for the function of ADAMTS13 (a metalloprotease that cleaves von Willebrand Factor), that Drosophila have novel glucuronyl O-Fuc glycans [44″], and that Peters Plus Syndrome is a new congenital disorder of glycosylation involving defective glucosylation of O-Fuc in thrombospondin Type 1 repeats (TSR).
TSRs can also carry C-linked mannose (C-Man) and it has been reported that this glycosylation affects signalling. Interestingly a viral glycoprotein has been shown to carry this unusual glycosylation.
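The logic of site mapping by ETD discussed in this section can be sketched with a small calculation: if the labile sugar remains attached to the peptide backbone fragments, the c-type ions that contain the modified residue are shifted by the mass of one HexNAc, and the ions that differ between two candidate sites identify the true one. The example below is a simplified illustration under stated assumptions: the peptide AGSVTGA is hypothetical, only singly charged c-ions are considered, and the residue masses are standard monoisotopic values rather than data from the cited studies.

```python
# Monoisotopic residue masses (Da) for the amino acids used in this example.
RESIDUE = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841, "T": 101.04768}
HEXNAC = 203.07937               # mass added by one O-GlcNAc (HexNAc) residue
C_ION_OFFSET = 17.02655 + 1.00728  # NH3 plus a proton, for singly charged c-ions


def c_ions(peptide, glyco_site):
    """Singly charged c-ion m/z values c1..c(n-1); glyco_site is 1-based."""
    ions = []
    running = 0.0
    for position, residue in enumerate(peptide[:-1], start=1):
        running += RESIDUE[residue]
        if position == glyco_site:
            running += HEXNAC  # every c-ion from here on carries the sugar
        ions.append(running + C_ION_OFFSET)
    return ions


def discriminating_ions(peptide, site_a, site_b, tol=0.01):
    """Indices (1-based) of c-ions whose m/z differs between the two candidate sites."""
    ions_a = c_ions(peptide, site_a)
    ions_b = c_ions(peptide, site_b)
    return [i for i, (a, b) in enumerate(zip(ions_a, ions_b), start=1) if abs(a - b) > tol]


if __name__ == "__main__":
    peptide = "AGSVTGA"  # hypothetical peptide with Ser at position 3 and Thr at position 5
    print(discriminating_ions(peptide, 3, 5))
```

Only the fragments lying between the two candidate residues distinguish the sites, so confident assignment depends on observing at least one of those ions. In practice the complementary z-type ions and multiply charged fragments would also be considered.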
Sugar sequences associated with O-Fuc and O-Glc are well defined (Figure 4), and C-Man and O-GlcNAc are known to be not further elongated. In contrast, many aspects of O-Man glycosylation remain enigmatic. A major carrier of O-Man is α-dystroglycan (α-DG) and defects in α-DG glycosylation are linked to congenital muscular dystrophies. There is compelling evidence for the presence of high molecular weight glycan polymers in α-DG whose biosynthesis requires the product of the LARGE gene. Despite considerable efforts in many laboratories over the past few years, neither the putative glycan polymers nor the function of LARGE is yet known. Nevertheless some new structural information on O-mannosylation has been gathered, including the discovery that in the mouse brain it can be substituted with sialyl Lewisx, an epitope that has thus far eluded physicochemical characterisation on other O-glycans, or, indeed, N-glycans, in the mouse. The cell adhesion molecule CD24 in the mouse brain has been recently shown to carry a diverse family of mucin-type and O-mannosyl glycans, the latter including sialyl Lewisx structures.
The review period has seen significant growth and advancement in the ability of mass spectrometric methods to analyse glycoproteins and their associated glycans. Technological innovations, such as MALDI-TOF/TOF and ECD/ETD, have broadened the understanding and application of glycan sequencing, while progress has also been made in the equally important areas of standardisation and quantitation. Similar levels of development are also apparent in the glycoinformatics field, with relational databases and analytical or data mining tools continuing to flourish under the auspices of the collaborative international consortia such as the CFG, KEGG and EuroCarbDB. The application of these techniques and tools to biological problems has expanded our knowledge in many areas, including the glycosylation of the mammalian immune and reproductive systems, as well as that of stem cells and bacteria.
The complexity and level of detail involved in glycoproteomic and glycomic research remain extremely high, with the more sensitive and accurate instrumentation and methodologies of today only serving to highlight this fact further. It is therefore vital that the open-access data policies pioneered by the collaborative international consortia be continued beyond their current funding periods, in order to preserve and continue to expand upon the excellent ground work that has been laid in the data repositories now available.
This research was supported by the Biotechnology and Biological Sciences Research Council (BBSRC) Grant Nos. BBF0083091, B19088 and BBC5196701, the Analytical Glycotechnology Core of the Consortium for Functional Glycomics (GM62116), the sixth European Union Research Framework Programme (EUROCarbDB RIDS Contract No. 011952) and the Wellcome Trust. Grateful thanks are due to Dr. Poh-Choo Pang for her efforts in producing Figure 2.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Papers of particular interest, published within the period of review, have been highlighted as:
″ of special interest
″″ of outstanding interest