Glycosylation plays fundamental roles in controlling various biological processes. Therefore, glycosylation analysis has become an important target for proteomic research and has great potential for clinical applications. With the continuous development and refinement of glycoprotein isolation methods, increasing attention has been directed to the quantitative and comparative aspects. This review describes the mass spectrometry (MS)-based techniques for the comparative analysis of glycoproteins and their applications to answer a wide range of interesting biological questions.
The recent explosion of proteomic research has created rich knowledge about the protein content of cells, tissues and whole organisms. The discovery of proteins with post-translational modifications (PTMs) has become an important frontier of proteomics studies, with more emphasis now placed on the structures and functions of proteins than on the sequence identification that dominated early work. Emerging separation techniques coupled with mass spectrometry (MS) offer great capabilities in elucidating more information on proteins with PTMs. In addition, the development of methods that are capable of measuring the relative expression of proteins between two or more samples has become an essential aspect of systems biology, and has greatly facilitated biomarker discovery in various diseases.
Protein glycosylation has long been recognized as a common PTM. Glycosylated proteins are ubiquitous components of both extracellular matrices and cellular surfaces. There are four known categories of glycosylation: (1) N-glycosylation, where glycans are attached to asparagine residues in a consensus sequence N-X-S/T (X can be any amino acid except proline) via an N-acetylglucosamine (GlcNAc) residue; (2) O-glycosylation, where the glycans are attached to serine or threonine; (3) glycosylphosphatidylinositol anchors, which are attached to the carboxyl terminus of membrane-associated proteins; and (4) C-glycosylation, in which sugars are attached to tryptophan residues in some membrane-associated and secreted proteins. As the first two are the most common forms of glycosylation, we limit our discussion to N- and O-glycosylation in this mini-review. Carbohydrates can have a great influence on the physicochemical properties of glycoproteins, affecting their folding, solubility, aggregation and propensity to degrade. Furthermore, glycan chains in glycoproteins play fundamental roles in many biological processes such as embryonic development, immune response and cell-to-cell interactions involving sugar–sugar- or sugar–protein-specific recognition. Consequently, aberrant glycosylation has now been implicated in many diseases, including hereditary disorders, immune deficiencies, neurodegenerative diseases, cardiovascular diseases and cancer. Many clinical biomarkers and therapeutic targets are glycoproteins, including Her2/neu (breast cancer), prostate-specific antigen (PSA, prostate cancer) and CA 125 (ovarian cancer) [4, 5].
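The N-X-S/T consensus rule described above lends itself to a simple sequence scan. The following is a minimal sketch, assuming a one-letter amino acid string as input; the function name and the toy sequence are hypothetical and serve only as illustration. Note that the presence of a sequon is necessary but not sufficient for N-glycosylation, so the positions returned are candidates only.

```python
import re

# N-X-S/T sequon with X != Pro. The lookahead keeps the match one residue
# wide, so overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein.
toy = "MKNGSANPTLNNTSQ"
print(find_sequons(toy))  # [3, 11, 12]; the Asn at position 7 is skipped (X = Pro)
```

A lookahead is used rather than a plain three-residue pattern so that adjacent candidate sites are not lost when one sequon overlaps the next.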
To examine disease-related alterations in glycosylation, sensitive, fast and robust analytical methods are required. Although the identification of proteins is routinely performed in many laboratories, the study of glycoproteins remains challenging. Comprehensive reviews have been published in recent years covering the isolation and characterization of glycoproteins [6–9]. Therefore, the focus of this review will be on recent developments in glycoproteomics, with an emphasis on the quantitative aspect, and their application in tackling biological problems.
Characterization of glycosylated proteins requires their isolation from complex biological samples that contain both nonglycosylated and heterogeneously glycosylated proteins. In general, glycoproteins can be purified by most conventional protein fractionation approaches, including various forms of HPLC, such as ion exchange, hydrophobic interaction, size exclusion and affinity chromatography, and electrophoresis separation. Specifically, affinity purification has been achieved by using lectins or antibodies that are specific for certain glycan structures and this methodology has been widely utilized in MS-based proteomics studies. Recently, chemical methods have also been developed to accomplish selective isolation, identification and quantification of glycoproteins and glycopeptides.
Since the proteomics community has become genuinely interested in the changes of proteins under different biological conditions, MS-based quantitation methods have gained increased popularity and played significant roles in functional proteomics and biomarker discovery over the past several years. Typically, quantitation can be achieved in one of two ways: (i) in ‘front end quantitation’, isotopic labels are incorporated either chemically or enzymatically to create a specific mass tag before MS analysis that serves as the basis for relative quantitation [10–12]; (ii) in ‘back end quantitation’, label-free approaches are performed by either comparing the signal intensity of peptide precursor ions belonging to a particular protein, or counting the number of tandem MS fragmentation spectra identifying the peptides of a given protein.
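The two label-free measures mentioned above can be sketched in a few lines. This is an illustrative sketch only: the function names, protein identifiers and intensity values are hypothetical, and real workflows add normalization and statistical testing that are omitted here.

```python
from collections import Counter


def spectral_counts(psm_proteins):
    """Spectral counting: number of identified MS/MS spectra per protein.

    psm_proteins is a list with one protein identifier per identified spectrum.
    """
    return Counter(psm_proteins)


def intensity_ratio(intensities_a, intensities_b):
    """Precursor-intensity quantitation: summed peptide precursor intensities
    of one protein in sample A divided by the same sum in sample B."""
    total_b = sum(intensities_b)
    if total_b == 0:
        raise ValueError("no precursor signal in the reference sample")
    return sum(intensities_a) / total_b


# Hypothetical data for illustration.
counts = spectral_counts(["P1", "P2", "P1", "P1"])
print(counts["P1"], counts["P2"])                          # 3 1
print(intensity_ratio([2.0e6, 1.0e6], [1.0e6, 0.5e6]))     # 2.0
```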
Lectins are a class of proteins isolated from plants, fungi, bacteria and animals that have a unique affinity towards carbohydrates. Lectin affinity chromatography is based on a reversible, specific interaction of each lectin with different oligosaccharides. Therefore, this method not only allows the isolation and enrichment of glycoproteins and glycopeptides, but also enables discrimination of glycan structures among different proteins and different glycoforms of the same protein. Commonly used lectin affinity chromatography protocols involve immobilization of lectins onto various forms of solid supports such as agarose and silica in a number of chromatographic formats, including tubes, columns and microfluidic channels [15, 16]. The study of Kaji et al. represents one of the earliest efforts to incorporate isotopic labelling with lectin affinity chromatography. Their approach, termed isotope-coded glycosylation-site-specific tagging (IGOT), is based on a sequential procedure involving the capture of glycopeptides by lectin affinity chromatography, followed by peptide-N-glycosidase (PNGase) mediated incorporation of 18O into the N-glycosylation site and LC–MS/MS analysis.
Various forms of quantitation methods have been explored to couple with lectin affinity enrichment. The incorporation of the isotope tag not only allows specific mapping of glycosylation sites, but can also be used for quantitative profiling based on a principle similar to that of isotope-coded affinity tag (ICAT). However, a recent study pointed out a potential pitfall in 18O-based N-linked glycosylation site mapping: the trypsin used for proteolysis remained active after several steps of sample treatment and led to the incorporation of 18O into the C-termini of the peptides during the deglycosylation step. The database search algorithm could subsequently confuse this with an 18O-labelled Asp residue near the C-terminus of a peptide, which resulted in numerous false-positive identifications. Qiu and Regnier developed an extended strategy called serial lectin affinity chromatography (SLAC), which fractionates oligosaccharides or glycopeptides into structurally distinct groups using a series of different lectins with precisely elucidated binding specificities. By isotopically labelling the glycopeptides before deglycosylation, one can recognize and quantify differences in the degree of branching between sialic acid-bearing glycan isoforms from specific glycosylation sites on proteins through differential labelling. Alternatively, Plavina et al. adopted a label-free method, using extracted ion chromatograms as a measure of the relative abundances of the peptides. They applied their comparative glycoproteomic approach to the biomarker study of psoriasis, and further validated their label-free quantitation results via ELISA measurements. The common workflows of quantitative glycoproteomics using lectins are illustrated in Figure 1.
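The mass arithmetic behind 18O-based site tagging, and behind the pitfall noted above, can be made explicit. The sketch below uses standard monoisotopic atomic masses; the constant names are ours. The final comparison is an illustration of how the numbers can coincide (an ordinary 16O deamidation plus a single C-terminal 18O gives the same total shift as one 18O-tagged site) and is not claimed to be the exact mechanism reported in the cited study.

```python
# Monoisotopic atomic masses (Da).
H = 1.00782503
N = 14.00307401
O16 = 15.99491462
O18 = 17.99915961

# PNGase-mediated deglycosylation converts Asn to Asp: the side-chain
# NH2 is replaced by OH, i.e. a net change of +O -N -H.
DEAMIDATION_16O = O16 - N - H   # in ordinary water, about +0.9840 Da
DEAMIDATION_18O = O18 - N - H   # in 18O water, about +2.9883 Da

# Residual trypsin activity can exchange a C-terminal carboxyl oxygen.
CTERM_18O = O18 - O16           # per exchanged oxygen, about +2.0042 Da

print(round(DEAMIDATION_16O, 4), round(DEAMIDATION_18O, 4), round(CTERM_18O, 4))

# Illustration: these two total shifts are identical at the precursor level,
# so only fragment ions that localize the shift can tell them apart.
print(abs((DEAMIDATION_16O + CTERM_18O) - DEAMIDATION_18O) < 1e-9)  # True
```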
Advantages of the lectin affinity approach include its simplicity and cost-effectiveness. Additionally, it is flexible: lectins can be used either in combination or in series. Weaknesses of this strategy do exist, including that the selectivity of some lectins is not well defined, and that non-specific binding to nonglycosylated proteins often occurs.
In addition to the affinity separation approaches, glycoproteins can also be isolated on the basis of their chemical reactivity. Towards this end, Zhang et al. proposed a method that enables selective isolation, identification and quantification of N-glycosylated peptides based on hydrazide chemistry, stable isotope labelling and the specific release of formerly N-linked glycopeptides via PNGase F. The chemical principle of this method lies in the conversion of the cis-diol groups of carbohydrates to aldehydes by oxidation, followed by the coupling to hydrazide groups immobilized on a solid support. This method was tested with human serum and prostate cancer cell membrane samples and showed great selectivity towards N-glycoproteins and high efficiency of quantitation. Similarly, glycoprotein enrichment can also be achieved by reaction with boronic acid immobilized on functionalized magnetic particles. Boronic diesters, which are stable under basic conditions, can be formed by the reaction of the cis-diols present in mannose, galactose or glucose with boronic acid. One unique strength of these approaches is that both N-linked and O-linked glycoproteins are conjugated to the solid support via covalent linkage without bias towards any particular structures.
Releasing N-glycosylated peptides is straightforward using PNGase F, while releasing O-linked glycopeptides requires a panel of exoglycosidases to sequentially remove monosaccharides until only the core structure remains attached, which can then be removed by O-glycosidase. For this reason, a chemical method such as β-elimination, which is more effective in removing O-linked glycans, is more desirable for many studies. Vosseller et al. developed an approach called BEMAD (β-elimination followed by Michael addition with dithiothreitol, DTT), which allowed MS-based identification and comparative quantitation of O-phosphate or O-GlcNAc-modified peptides. BEMAD involves differential isotopic labelling by normal DTT (d0) or deuterated DTT (d6) through Michael addition and enrichment of these peptides by thiol chromatography. Reduction reaction catalysed by N-acetyl-hexosaminidase was measured by isotopic labelling and differentiated specific sites of O-GlcNAc from those of O-phosphate.
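The d0/d6 labelling described above produces peptide pairs separated by a fixed mass difference, which is what makes relative quantitation possible. The following is a minimal sketch of pairing light and heavy signals, assuming a list of deconvoluted neutral masses with intensities; the function name, tolerance and peak values are hypothetical. For a peptide observed at charge z, the spacing on the m/z axis would be the shift divided by z.

```python
# Six hydrogens replaced by deuterium per DTT tag (d6 versus d0).
D_MINUS_H = 2.01410178 - 1.00782503
TAG_SHIFT = 6 * D_MINUS_H  # about 6.0377 Da per labelled site


def pair_peaks(peaks, n_tags=1, tol=0.01):
    """Pair light (d0) and heavy (d6) signals.

    peaks: list of (neutral_mass, intensity) tuples.
    Returns a list of (light_mass, heavy_to_light_ratio).
    """
    shift = n_tags * TAG_SHIFT
    pairs = []
    for m_light, i_light in peaks:
        for m_heavy, i_heavy in peaks:
            if i_light > 0 and abs((m_heavy - m_light) - shift) <= tol:
                pairs.append((m_light, i_heavy / i_light))
    return pairs


# Hypothetical peak list: one d0/d6 pair and one unpaired signal.
peaks = [(1500.700, 1.0e5), (1506.738, 2.0e5), (1800.900, 5.0e4)]
print(pair_peaks(peaks))  # [(1500.7, 2.0)]
```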
Recently, Khidekel et al. developed an improved GlcNAc-specific labelling strategy termed quantitative isotopic and chemoenzymatic tagging (QUIC-Tag), which relies on specific modification of proteins containing a terminal GlcNAc moiety with a β-1,4-galactosyltransferase that has been engineered to transfer a ketone-containing galactose to the C4 hydroxyl of a GlcNAc. The ketone then becomes the tagging target of an aminooxy biotin derivative for the purpose of enrichment and identification, and primary amines of the peptides are labelled by isotopic formaldehyde via reductive amination for quantitative MS analysis (Figure 2). One of the unique strengths of the QUIC-Tag strategy is the use of electron-transfer dissociation (ETD), a relatively new fragmentation method based on radical-initiated backbone cleavage. The advantage of ETD is its ability to retain labile modifications, which allows the identification of exact sites of glycosylation. This is often not possible with the traditional collision-activated dissociation (CAD) fragmentation technique, which cleaves at the labile PTM bonds prior to fragmentation along the peptide backbone. By combining the chemoenzymatic reaction with novel instrumentation methods, QUIC-Tag is able to offer the best strategy in O-linked glycoprotein identification in terms of enrichment, specificity, site determination and quantitation.
Glycoproteomics is currently experiencing rapid growth both in terms of methodologies and the range of applications facilitated by these novel approaches and advancements in instrumentation. Large-scale comparative glycoproteomic analyses have gained increasing attention for two major reasons: (i) Functionally, the oligosaccharide moieties of various glycoproteins act as selectivity determinants, playing a fundamental role in many biological processes such as immune response and cellular regulation because cell-to-cell interactions involve sugar–sugar- or sugar–protein-specific recognition. Studying the profiles of the glycoproteins is likely to provide critical information regarding the roles they play in a particular biological system and will shed light on the mechanism and pathogenesis of certain diseases. (ii) The current bottleneck in discovering biomarkers in biofluids such as serum using MS is its limited dynamic range of detection compared to the much larger range of protein concentrations in the samples. Targeting a subset of the whole proteome, such as the glycoproteome, can be an effective way to simplify the sample and lower the detection limit. Additionally, aberrant glycosylation patterns might provide clues to disease-relevant biomarkers. In this section we review the recent applications of quantitative glycoproteomics in several important research fields.
It is known that glycosylation profiles change significantly during oncogenesis [28, 29]. For example, an increased activity of N-acetylglucosaminyltransferase V, an enzyme responsible for the formation of branching N-linked glycans, has been linked to tumour invasion and metastasis in several cancers [30–32]. Therefore, tumour-secreted glycoproteins can serve as potential targets for biomarker discovery for diagnostics. One of the best defined cancer biomarkers is PSA, a secreted glycoprotein with one defined N-linked glycan chain. PSA is primarily secreted by prostatic epithelial cells into the seminal plasma, and the glycoforms of PSA from prostate cancer patients have been shown to differ from those of healthy controls. Moreover, tumour-specific alterations of glycan structures could be potential targets for cancer immunotherapy, such as epitopes for therapeutic monoclonal antibodies.
The most widely applied quantitative glycoproteomics strategy in cancer biomarker discovery involves lectin affinity chromatography. Blood plasma is the primary source for the research because of its richness in secreted proteins and the easy accessibility of the sample compared to diseased tissues. Immunodepletion of the several most abundant proteins, including albumin and immunoglobulin, is optional but usually helpful in reducing the concentration dynamic range of the sample. Since the aberrant addition of α-1,6-fucose on the core GlcNAc has been shown in multiple types of cancers [36–38], fucosylated proteins have served as major targets for cancer biomarker research. For example, Xiong et al. conducted comparative analysis of the α-L-fucose containing tryptic glycopeptides with differential labelling with d0- or d6-succinimidyl acetate, followed by enrichment with immobilized lectin Lotus tetragonolobus agglutinin (LTA). Their method was applied to a study of lymphosarcoma in dogs, and it was found that a series of fucosylated proteins in the blood decreased in concentration by more than 2-fold during chemotherapy. Of the proteins identified, CD44 and E-selectin are known to be involved in cell adhesion and cancer cell migration. Similarly, Ueda et al. specifically enriched α-1,6-fucosylated peptides in immuno-depleted human serum samples using a Lens culinaris (LCA) lectin column and revealed 34 candidate biomarker glycoproteins for lung cancer by quantitative proteomic analysis using 12C(6)- or 13C(6)-NBS (2-nitrobenzenesulfenyl) stable isotope labelling followed by MALDI-QIT-TOF MS analysis. Comunale et al. employed both glycomics and targeted glycoproteomics to investigate not only the changes in protein concentrations, but also the levels of fucosylation in liver cancer. In total, 19 proteins were found to be hyperfucosylated in cancer. Zhao et al. took a different quantitation approach to search for pancreatic cancer biomarkers.
In their study, sialylated glycoproteins from normal and cancer sera were extracted by three different lectins and fractionated by nonporous silica reverse phase (NPS-RP) HPLC. UV absorption of the intact proteins during HPLC provided a reproducible means to quantify the expression of glycoproteins. As a result, sialylated plasma protease C1 inhibitor and the N83 glycosylation of α-1-antitrypsin were found to be down-regulated in cancer serum. Lubman and coworkers identified plasma glycoproteins with aberrant glycosylation via a combination of lectin glycoarray, statistics and LC–MS/MS, and moved their colorectal cancer biomarker research one step further by validating the biomarker candidates by lectin blotting in an independent set of samples. The potential biomarkers for colorectal cancer diagnosis included elevated sialylation and fucosylation in complement C3, histidine-rich glycoprotein and kininogen-1.
In addition to the lectin affinity chromatography approach, chemical methods have also been employed in glycoprotein analysis in cancer biomarker research. For example, Soltermann et al. applied hydrazide solid phase chemistry to capture the glycopeptides from malignant pleural effusions of patients with lung cancer and controls, and were able to access the moderate to low protein concentration range (μg/ml to ng/ml) with the identification of several proteins associated with tumour progression or metastasis, such as CA-125, CD44, CD166 and lysosome-associated membrane glycoprotein 2 (LAMP-2), among others. Sun et al. utilized the same chemistry and demonstrated the utility of this approach to study the membrane proteins of the microsomal fraction from a cisplatin-resistant ovarian cancer cell line that is rich in membrane proteins. They improved the hydrazide capture method by using sodium sulfite as a quencher to remove excess sodium periodate, replacing the solid phase extraction step used in earlier studies, which allows the overall capture procedure to be completed in a single vessel.
In addition to cancer, glycoproteomics approaches have also found widespread applications in neurodegenerative disease research, with the goals of studying disease mechanisms and aiding diagnosis. Aberrant glycosylation changes have been shown to occur in Alzheimer's disease (AD). Liu et al. [44, 45] have shown that aberrant glycosylation may modulate tau protein at a substrate level so that it is easier to phosphorylate and more difficult to dephosphorylate at several phosphorylation sites in AD brain. Small and coworkers identified glycosylated isoforms of acetylcholinesterase and butyrylcholinesterase that are increased in AD cerebrospinal fluid (CSF). Moreover, glycosylation patterns have been found to be altered in other neurodegenerative diseases. For example, Reelin, a glycoprotein that is essential for the correct cytoarchitectonic organization of the developing central nervous system (CNS), is up-regulated in the brain and CSF in several neurodegenerative disorders, including frontotemporal dementia, progressive supranuclear palsy, Parkinson's disease (PD) as well as AD. Furthermore, glycosylation patterns of Reelin differ in plasma and CSF, and the CSFs of control and diseased samples also exhibit different glycosylation patterns. These results support the involvement of the glycoprotein Reelin in the pathogenesis of numerous neurodegenerative diseases.
Glycoproteins present in CSF can be a great source of biomarkers of neurodegenerative diseases because changes in CSF composition can reflect the on-going disease conditions in the brain. In a preliminary study using two-dimensional gel electrophoresis (2D-GE), several glycoproteins, including apolipoprotein E, clusterin and α-1-B-glycoprotein, were altered in the CSF from AD patients. Similarly, Sihlbom et al. used 2D-GE stained with Pro-Q Emerald 300 to compare the glycoproteomes of CSFs from AD patients and controls. The glycopeptides of differentially expressed glycoforms were subjected to fragmentation with infrared multiphoton dissociation (IRMPD) on a Fourier transform ion cyclotron resonance (FTICR) mass spectrometer, which offers abundant fragment ions through breakage at the glycosidic linkages with limited dissociation of the peptide backbone, and excellent mass accuracy, enabling the structural determination of site-specific N-glycosylation. In their follow-up study, albumin depletion was performed prior to 2D-GE analysis to enhance glycoprotein concentration for image analysis. As a result, one isoform of α-1-antitrypsin showed decreased glycosylation in AD patients while protein expression levels of apolipoprotein E and clusterin were increased. Compared to CSF samples, biomarker discovery in blood poses a greater challenge due to the huge dynamic range of protein concentrations. Wei et al. applied comparative glycoproteomics to prion disease biomarker discovery by employing lectins to enrich the glycoproteins and remove the abundant nonglycoproteins from mouse plasma samples, followed by multidimensional separation of isotopically labelled tryptic peptides via reversed phase HPLC under different pH conditions. As a result, 280 glycoproteins were identified, among which 49 proteins exhibited more than 2-fold changes in the blood from mice infected with prion disease.
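A fold-change cut-off such as the 2-fold criterion mentioned above is usually applied symmetrically, so that a halving counts the same as a doubling. A minimal sketch of such a filter is shown below, assuming a mapping from protein name to disease/control abundance ratio; the function name, protein labels and ratios are hypothetical, and no statistical test is included.

```python
import math


def changed_more_than(ratios, fold=2.0):
    """Keep proteins whose ratio exceeds `fold` in either direction.

    ratios: dict mapping protein name to (disease / control) abundance ratio.
    The comparison is done on the log2 scale so that up- and down-regulation
    are treated symmetrically; ratios exactly at the cut-off are excluded.
    """
    threshold = math.log2(fold)
    return {
        protein: ratio
        for protein, ratio in ratios.items()
        if ratio > 0 and abs(math.log2(ratio)) > threshold
    }


# Hypothetical ratios for illustration.
ratios = {"A": 2.5, "B": 0.4, "C": 1.3, "D": 0.5}
print(sorted(changed_more_than(ratios)))  # ['A', 'B']
```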
Recently, O-GlcNAc, an O-linked glycosylation analogous to phosphorylation, has become the target of studies in neurological systems. For example, lectin weak affinity chromatography (LWAC) has been used to study in vivo O-GlcNAc from a postsynaptic density preparation. Because O-GlcNAc-modified peptides usually fragment poorly in traditional CAD owing to the preferential dissociation of the labile O-GlcNAc, an alternative fragmentation method, electron capture dissociation (ECD) on a hybrid linear ion trap-Fourier transform ion cyclotron resonance (LIT-FTICR) mass spectrometer, was used for its ability to preserve labile PTMs. The effectiveness of this strategy for complex peptide mixture analysis was demonstrated through enrichment of 145 unique O-GlcNAc-modified peptides, 65 of which were sequenced and belonged to proteins with diverse functions in synaptic transmission. The combination of this work and an accompanying report on the phosphoproteome of postsynaptic density preparations suggests complex protein regulation at the synapse through the potential interplay of these PTMs. Importantly, Khidekel et al. applied their newly developed QUIC-Tag method to cultured cortical neurons and in vivo-stimulated rodent cerebral cortex. For the first time, their approach revealed that while certain sites of glycosylation undergo significant changes in occupancy in response to a particular stimulus, other sites remain virtually unchanged. This dynamic differential modulation suggests that O-GlcNAc occurs reversibly in neurons, and may have important roles in mediating the communication between neurons in a fashion analogous to that of phosphorylation.
Because of the ubiquitous nature of glycosylation and its widespread involvement in many physiological processes, glycoproteomics has found applications in other fields such as microbiology, diabetes and plant biology, to name a few. However, most of those studies focused mainly on glycoprotein identification or glycan structure determination, whereas only a few took a quantitative approach.
The work of Atwood et al. on Trypanosoma cruzi represents the first effort of glycoproteomic analysis of a human pathogen. Through glycopeptide enrichment by lectin affinity chromatography from subcellular fractions and isotopic labelling of the glycosylation sites with 18O-labelled water, 36 glycosylation sites from 29 glycoproteins were unambiguously identified. More recently, Mehta et al. conducted a quantitative study with the sera of hepatitis C virus-infected individuals. Using comparative glycoproteomics, they observed an increase in both the abundance and the level of fucosylation of galactose-deficient anti-Gal immunoglobulin G (IgG) in serum upon the development of liver fibrosis and cirrhosis. This alteration in anti-Gal IgG allowed the development of a plate-based test to quantify the changes using fucose-binding lectins.
Interestingly, Hincapie and coworkers adopted a typical two-step glycoproteomic protocol, which combines abundant protein depletion and multi-lectin affinity chromatography, but used it to remove the glycoproteins instead, and then studied changes in the unbound fraction in sera from patients with obesity, diabetes and hypertension. The sample complexity was greatly reduced by this procedure. After over 90% of the total protein mass was removed by the immunodepletion column that targets the highly abundant proteins, about 56% of the remaining proteins were eluted in the unbound fraction. The label-free spectral counting approach was employed in this study for quantitation, and changes in several proteins were determined. For example, apolipoprotein C-I was shown to be elevated in all diseased groups.
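The degree of simplification implied by these two figures can be worked out directly. The short sketch below treats the quoted values (over 90% depleted, about 56% of the remainder unbound) as inputs; the function name is ours, and because the depletion figure is a lower bound, the result is an upper bound on the share of the original protein mass that reaches the unbound fraction.

```python
def unbound_share_of_total(depleted_fraction, unbound_share_of_remaining):
    """Fraction of the original protein mass that ends up in the
    lectin-unbound fraction after immunodepletion."""
    remaining = 1.0 - depleted_fraction
    return remaining * unbound_share_of_remaining


# With at least 90% depleted and about 56% of the remainder unbound,
# at most roughly 5.6% of the starting protein mass is analysed.
upper_bound = unbound_share_of_total(0.90, 0.56)
print(f"{upper_bound:.1%}")  # 5.6%
```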
Compared to animal models, especially eukaryotes, little information about the glycoproteins associated with cell differentiation and transformation is available for plants. The work of Elbers et al. is one of the few studies on N-glycosylation and its potential roles in the adaptation of plant cells to environmental or physiological changes. More recently, Balen et al. conducted a glycoproteomic profiling of the tissue grown in vitro from a succulent cactus plant, Mammillaria gracillis. Proteins from different tissues were separated by 2D-GE and transferred onto a nitrocellulose membrane, followed by detection of N-glycosylated proteins with a lectin Con A affinoblot. The oligosaccharides from selected proteins were released by PNGase A and analysed by MALDI-TOF MS. The results obtained in this study indicated that the glycosylation profile of the same protein is highly dependent on the organization level of the plant tissue and can be correlated to specific morphogenic status.
Traditionally, analysis of glycoproteins has been a great challenge in proteomics due to the high complexity of the glycan structures and the presence of multiple glycoforms of the same protein. However, in the past few years, significant progress has been made in structural glycobiology, owing to advances in both highly efficient separation methodologies and sophisticated MS technologies. For example, combinations of different lectins have been explored to isolate peptides and proteins with particular glycan structures. Chemical methods have also been developed to target specific functionalities on the glycan chains. With regard to MS technologies, the development and implementation of multiple complementary fragmentation techniques enable a more detailed view of glycosylation modifications.
Biological effects such as disease progression are usually associated with the changes in the level of protein expression as well as in the stoichiometry of glycosylation and glycosylation patterns. Therefore, it is essential to integrate quantitative capabilities into the routine analysis. The glycoproteomics research community has benefited from the development of quantitative approaches widely employed by the whole proteomics community, from the traditional gel visualization, to the popular MS-based isotopic labelling, and finally, the novel label-free methods boosted by the development of new algorithms and software.
The marriage of glycoprotein enrichment and quantitative MS analysis provides great opportunities for biomarker research. Comparative glycoproteomics approaches have found the most applications in cancer biomarker research, in part because the tumour-secreted and the tissue-shed proteins in biological fluids are likely to be glycosylated. Because comparative glycoproteomics can significantly enrich glycoproteins of very low abundance and reduce the complexity of the sample, it has become a more attractive method for applications beyond cancer research, such as the diagnosis of neurodegenerative and infectious diseases.
Even with the advances in technology, the structural complexity of glycoproteins remains a significant challenge. The dynamic range of detection offered by current technology still falls short for most biological samples. Accuracy and reproducibility of quantitation are critical issues to be addressed in method development. While comparative glycoproteomics offers a promising tool for biomarker discovery in complex biofluids, disease diagnosis only serves as the first step towards understanding the molecular mechanisms of the diseases. Highly specific and targeted proteomics approaches, such as those targeting glycosylation and phosphorylation, will undoubtedly accelerate our pace in uncovering the underlying mechanisms of various diseases and offer new insight into the development of effective therapeutic strategies for these diseases.
This work was supported in part by National Institutes of Health through grant AI0272588 and the Wisconsin Alumni Research Foundation at the University of Wisconsin-Madison.
L.L. acknowledges an Alfred P. Sloan Research Fellowship.
Xin Wei is a graduate student in the Department of Chemistry, University of Wisconsin - Madison, WI, USA.
Lingjun Li is associate professor in the Department of Chemistry and School of Pharmacy, University of Wisconsin-Madison, WI, USA.