The enormous structural diversity of glycoconjugates mirrors their myriad biological functions in prokaryotic and eukaryotic systems. Various glycan molecules are known to participate in numerous general and specialized ways in virtually all regulatory pathways in microbes, fungi, plants, and mammalian systems. The many investigations into the genetic and cellular components of glycosylation processes are revealing their central importance and thereby are leading glycosciences on to the center stage of modern biomedical research.1,2 Recent short reviews capture the overall importance of glycosylation and glycan-protein interactions in mammalian cellular biology;3–6 the general importance of developing the glycosciences and its enabling tools is underscored in the 2012 report of the National Research Council to the U.S. National Academies.7 Since glycan biosynthesis is not directly subjected to a template-driven process, solid structural and quantitative analytical data concerning glycan types and their distribution are needed. Therefore, modern bioanalytical methods, technologies, and instrumentation, in particular mass spectrometry, have become increasingly important to solving the mysteries of glycoscience. As documented by the rapidly increasing numbers of published accounts, in part highlighted below, glycomics and glycoproteomics are now fast-growing fields of scientific endeavor.
This review focuses on the last 3 years of methodological and instrumental developments in analytical glycoscience with a particular focus on glycoproteins. Although not covered specifically in this article, many of these methods also have applications in the analysis of other important classes of glycoconjugates such as proteoglycans, glycolipids (including glycosphingolipids, GSLs), and polysaccharides. Processes to probe glycan–protein and sugar–sugar interactions through lectin and glycan microarrays also will not be discussed here.8,9 This review also builds upon prior comprehensive reviews of analytical approaches for the structural characterization of glycoproteins,10 including an in-depth description of mass-spectrometric techniques11 and other instrumental aspects.12–14
The extreme complexity and diversity of glycoprotein structures continues to demand new processes for their elucidation. To this end, mass spectrometry (MS) continues to be the central technique in the structural characterization of glycans and glycopeptides. Instrumental improvements and the availability of reliable commercial instrumentation to numerous laboratories have also driven new developments in terms of ionization and fragmentation techniques and of selective ion monitoring. The procurement of quantitative and not only qualitative data has advanced markedly due to the novel uses of isotopic labeling methods. Recent years have also shown an abundance of new applications for ion mobility/mass spectrometry (IM-MS) hybrid techniques to the problems of glycoprotein characterization.
Contemporary analytical glycoscience can now deal effectively with the analyses of isolated glycoproteins from relatively well-characterized biological sources (as seen, for example, in the analyses of biopharmaceuticals or affinity-isolated mammalian glycoproteins) where important analytes may be present at trace levels, but complex biological mixtures such as tissue extracts or biological fluids still pose challenges. Therefore, we first discuss recent advances in sample preparation, fractionation, and preconcentration procedures that can potentially overcome some of the long-standing problems in glycoanalysis. Current glycan release procedures are then reviewed with a particular emphasis on recent advances in O-glycan release processes. Chemical derivatization of such free glycans has traditionally improved detection limits and enhanced structural information from MS measurements and creative work in the design of new sample derivatization agents for glycans at microscale continues.
After a discussion of new spectrometry developments, the benefit to a number of selected application fields of liquid chromatography (LC) and capillary electrophoresis (CE), and their combination with MS, is highlighted. The primary goal of these techniques is to profile the glycan pools released from complex glycoproteins. The selected applications underscore the success of newly developed analytical methods and instrumentation.
Although this review emphasizes primarily the measurements performed in glycan pools and their structural characterization (glycomics), efforts to make connections between the protein structure and its glycan substitutions are clearly increasing; the long-expected and desirable merger of data from glycomics and glycoproteomics is really starting to happen. This is increasingly evident from the efforts to immobilize analyte glycoproteins on new types of sorption materials, followed by their enzymatic treatment, site-labeling, and MS investigations. New analytical platforms and procedural automation appear indicative of the maturation of the field toward dealing with large sample sets and comparative analyses. As these developments result in ever-increasing and highly complex data sets, new software developments and reports of computer-aided procedures are also the necessary consequence of new analytical developments.
Although the glycobiology community cannot yet readily probe the “protein–glycan interactome”6 and the spatial and temporal organization in biological cells and tissues, the recent pioneering investigations using MS imaging raise considerable hope for future studies of this kind.
With the enormous complexities of entire glycomes in different organisms, a universal reliable way of profiling all these important processes (equilibrium glycan concentrations, physiological functions, biosynthesis, disease conditions, etc.) will remain a distant dream even with the best available measurement techniques. Unfortunately, standardized protocols for sample preparation schemes have not even emerged yet and therefore comparison of data between different experiments to define “normal” versus “aberrant” glycosylation levels is still challenging. As different glycan recognition events take place in the intracellular space, in membranes, and in extracellular fluids, appropriately selected samples are of utmost importance. Ultimately, an understanding of the total glycome of any organism likely will depend first on the descriptions of different “metaglycomes”,6 such as those of particular cell types, tissues, breast milk, and other physiological fluids, which fortunately are now within the reach of the current measurement technologies.
A deeper view of any “metaglycome” may include up to several hundred glycan constituents, as is, for example, the case for free oligosaccharides and glycolipids in mammalian maternal milk,15,16 from the tentatively estimated thousands of oligosaccharides in the total glycome of an organism.17 Given the dynamic range limitations of current measurement techniques, including MS, reliable detection and identification of minor mixture constituents necessarily involve sample fractionation and/or preconcentration prior to the final MS, LC, and CE measurements. Higher-level tandem MS (MSⁿ) measurements are helpful in achieving greater structural information18 on different glycans, but the sample demands in these situations may not always be in tune with typical situations in biomedical and clinical research where minute volumes, small biopsy samples, and trace analyses are most often required. Demands for ever greater measurement sensitivities are very common in the practice of contemporary glycoscience. Less stringent demands on sensitivity are encountered in the rapidly developing field of glycoprotein-based biopharmaceuticals where sample availability is not usually a major issue. Consequently, as evidenced in the different application areas detailed below, demands on a sample treatment strategy may differ substantially.
Glycoproteins of interest are characteristically contained in complex biological mixtures from which they must be purified and preconcentrated apart from other molecules including non-glycosylated proteins, lipids, nucleic acids, and small metabolite molecules. Numerous effective workflows have been developed for glycomic analyses and profiling from physiological fluids such as plasma, serum, and cerebrospinal fluid with typical volumes consumed in a procedure now at only low-microliter levels. The downside of these determinations is often a complicated sequence of purification steps and glycan release and derivatization procedures that must all be carefully controlled and executed to maintain the needed precision of analytical measurements. For example, when the concentrations of glycoproteins and potential biomarkers in blood serum or plasma span a respectable range of 12 orders of magnitude,19 extensive removal of major components such as albumin and IgG becomes essential. Additionally, hydrolytic procedures involving enzymes must first be optimized in terms of buffers and additives during the sample treatments, but the excess of salts, detergents, tagging reagents, and other reaction components must be minimized before the MS measurements. Different approaches to sample treatments are reflected in the currently available sets of methods and protocols20,21 but as the field develops, procedural simplifications are being continuously sought with dialysis, solvent extraction, microcolumn chromatographic purification, and solid-phase extraction methods.
Automation of the individual steps in glycomic analytical procedures can reduce volumetric errors and minimize sample losses, thus enhancing the overall reproducibility of the sample preparation protocol. Simultaneously, analytical throughput can be substantially enhanced.22–24 In experiments using a 96-well plate format to measure fucosylated N-glycans from isolated serum haptoglobin samples,25 protein denaturation, deglycosylation, desialylation, and permethylation steps could be performed sequentially in a plate format prior to MALDI-MS profiling measurements. Additional sophisticated glycan derivatization schemes have notably been incorporated into the automated workflow platforms prior to matrix-assisted laser desorption ionization-time-of-flight-mass spectrometry (MALDI-TOF-MS),26 showing high repeatability. Shubhakar and co-workers have recently reviewed additional uses of automation and robotics for reproducible and routine glycomic analyses.24 While ancillary to the glycomic measurements themselves, successful automation efforts are rendering large-scale biomedical projects feasible with hundreds of samples reliably measured and quantitatively evaluated. Aided by instrument manufacturers, robotic systems were developed early for the LC/fluorescence detection runs25 and, similarly, for the multiplexed uses of CE at high throughput.26
Whereas the analyses of physiological fluids start with a homogeneous state for extraction, cellular and tissue samples present a different set of problems. In most laboratories practicing glycomic analyses today, tissue samples or cellular preparations are typically homogenized and extracted, disregarding the fact that different tissue regions can be quite heterogeneous in their cellular compositions and, consequently, their glycoconjugate content. However, the resulting analyses can still provide an initial overview of the sample. With the improvements in sample handling of histological tissues and instrument sensitivity, the situation is likely to improve soon as evidenced by efforts to spatially profile tissues by on-surface enzymatic digestions and microfiltrations followed by different MS off-line measurements.27–29 For example, a multienzyme workflow has been demonstrated in which N-glycans, glycosaminoglycans, and peptide profiles can sequentially be determined from tissue samples as small as 1.5 mm in diameter.29
Mass spectrometry imaging (MSI) has been an attractive area of research for more than a decade and eliminates some of the sample handling issues discussed above. With peptides and lipids as the usual analytical targets on different tissue surfaces (e.g., animal organ tissues or tumor biopsies), different ionization techniques, such as secondary ion mass spectrometry (SIMS), desorption electrospray ionization (DESI), and most commonly MALDI-MS, have been utilized with remarkable improvements in spatial resolution during recent years.30 However, applications of MSI to glycans are more recent, which is not surprising given the considerably lower ionization efficiencies of glycoconjugates compared to other biomolecules and thus the need for highly sensitive detection. Formalin-fixed and paraffin-embedded tumor tissues have been particularly attractive targets of these investigations due to their availability in many tissue banks for future histological studies. While histochemical staining procedures using lectins (and, to a lesser degree, antibodies) point to the presence or absence of selected glycan subgroups, the newly developed MSI tools31–33 can reliably pinpoint specific N-glycan structures. These pioneering studies in MSI indicate very promising paths toward a better understanding of the roles glycans play in cellular physiology and pathological processes. The overall procedures involve tissue sectioning, deparaffinization, surface rehydration, denaturation of proteins, and treatment with N-glycosidases, all on a suitable planar surface.33 Subsequently, tissues are overcoated with a MALDI matrix and sequentially analyzed through MS. As with many other MS-based glycomic procedures, terminally located sialic acids present difficulties due to their labile nature. In a most recent publication,34 this problem was substantially solved through a linkage-specific sialic acid derivatization, discussed in more detail below. The added advantage of this procedure is that the biologically relevant α-2,3 and α-2,6 linkages can be elucidated at the level of different tissue regions.
Although many laboratories will likely continue performing bulk extractions and analyses of cellular materials for some time in the future, a cautionary note is in order. Different sample processing and cleanup procedures can clearly lead to very different analytical conclusions. In a comparative study conducted under the auspices of the Human Proteome Organization (HUPO), 14 leading laboratories were provided with lyophilized cell pellets of three cancer cell lines35 and asked to report their analytical findings. Each laboratory processed these cell lysates according to their preferred protocols, in which the levels of sialic acids were particularly emphasized. Unfortunately, there was little agreement in both the glycan and glycopeptide levels reported by the groups who had previously shown good collective results in two similar pilot studies on purified glycoproteins. This extremely valuable exercise highlights the importance of developing more refined and definitive procedures for cellular materials.
Recovery of glycoproteins from biological fluids and complex cellular extracts is the first essential step toward reliable analyses at the level of glycans or glycopeptides. The removal of small molecules such as lipids and electrolytes as potentially interfering materials is also recommended; the additional removal of nonglycosylated proteins is often also beneficial as a preliminary step. At this point, affinity-based separations can be most effectively employed for both sample enrichment and major protein depletion, as currently recognized by most glycoproteomic platforms.10 For example, commercial depletion columns are available for removing albumin and other major proteins from serum and plasma samples. At the level of glycoprotein mixtures, specialized capture materials featuring antibodies, immunoadsorbents, or immobilized lectins can all be employed to significantly reduce mixture complexity. For example, quantification of a priori selected glycoproteins can be facilitated through the use of available antibodies, as is shown with the antibody-assisted capture of α-1-acid glycoprotein (AGP) from a 5-μL aliquot of human blood serum (Figure 1) and its fidelity in revealing the expected N-glycans.12 More recently, an antihaptoglobin antibody immobilized to a hydrazide resin was successfully used in a preconcentration column for a simple, high-purity isolation of serum haptoglobin, prior to determination of the bifucosylated glycans that distinguish hepatocellular cancer from liver cirrhosis.23 In a similar fashion, immunoglobulins and their associated glycans can be probed in different states of health after their sequential purification through immobilized Protein G and Protein L microscale affinity columns.36
Surface-bound lectins continue to be utilized and further explored as a means of glycoprotein fractionation and as selective reagents for lectin microarrays.8 Among the primary reasons for the use of lectins in glycoanalysis is their functional diversity and applicability due to different carbohydrate-binding motifs.37 With the many easily obtainable or commercially available lectins38 and an increasing number of recombinant lectin variants, the interest of the glycoscience community in their use is likely to continue. However, the structural diversity of lectins can create certain problems with efficient surface immobilization. Whereas the immobilization of lectins is typically performed on agarose gels or resin materials, more rigid silica-based supports are desirable in the analytical platforms involving pressurized systems. For example, macroporous silica particles39 utilizing mannose-binding concanavalin A (ConA) and fucose-binding Aleuria aurantia lectin (AAL) could isolate different glycoprotein pools from a mere 1-μL volume of blood serum. High binding capacities for glycoproteins were noted using this macroporous material. Whereas many lectin-based sample preconcentration protocols appear effective, absolute selectivity for a class of carbohydrates cannot always be guaranteed. A fairly exhaustive review of the uses of lectin affinity chromatography in glycoanalysis prior to 2013 is available,11 and more recently a comprehensive database of antiglycan reagents including lectins and antibodies has been established to guide experiments.40 Although lectins and other carbohydrate-binding proteins have preferred binding partners, their binding affinities, especially with complex biological samples, can be broader than is often appreciated.
The last several years have also seen significant efforts to develop specialty resin surfaces to capture glycoproteins from complex biological mixtures prior to MS analyses. The widely used and time-honored hydrazide chemistry solid-phase extraction resin41 with isotope labeling and its numerous modifications (recently reviewed10) are documented in the literature for the identification of N-glycosites. More recent efforts have focused on procedures where either the glycans alone or both N-linked glycans and glycosite-containing peptides can be quantitatively assessed. In 2013, Yang and co-workers42 reported a procedure for immobilization of glycoproteins using reductive amination of N-termini and lysine residues. Following an extensive wash to remove unconjugated molecules as well as the residual reagents, glycans are released through the usual cleavage procedures: enzyme-based release of N-glycans and ammonium hydroxide-based O-glycan release. Alternatively, sialic acid residues could be protected, while still on the resin, through derivatization. The overall procedure was tested through MALDI-MS of human serum N-glycans and O-glycans from porcine mucin.
Comprehensive glycoprotein analyses from solid-phase media have spawned elaborate platforms such as solid-phase extraction of N-linked glycans and glycosite-containing peptides (NGAG)43 and solid-phase reversible sample-prep (SRS).44 The overall NGAG procedure with the attachment to the resin involves seven steps, in which the properly modified glycopeptides are attached to an aldehyde-functionalized solid support through reductive amination, while carboxyl groups, C-termini, and sialic acids are all protected by aniline treatment. Following the enzymatic cleavage of N-glycans from the matrix, all glycosite-containing peptides are also released for MS analysis and extensive data processing. The described SRS platform takes advantage of a uniquely functionalized silica material and its reversible adsorptive nature. The overall platform (Figure 2)44 starts with a noncovalent attachment step, a washing step, and an enzymatic release of N-glycans in the presence of H₂¹⁸O, which labels the original N-glycosites. The subsequent protease digestion yields a mixture of peptides, including those with isotopically labeled glycosites. The mixtures of N-glycans and peptides are separately analyzed by appropriate analytical techniques. Development of comprehensive platforms, particularly those that can be easily automated, appears to be a major step toward bridging the frequent and unfortunate gap between the fields of glycomics and glycoproteomics.
Mixtures of glycoproteins are frequently hydrolyzed by protease enzymes to yield complex mixtures of glycosylated and nonglycosylated peptides. Different MS techniques are ultimately used to identify glycosylation sites or, alternatively, further released glycans. However, these measurements necessitate enrichment of glycopeptides to separate them from interfering nonglycosylated peptides and thus decrease mixture complexity. The glycopeptide enrichments seen in the recent literature vary widely in terms of preconcentration principles, formats (microcolumns filled with beads, monolithic materials, cotton-filled pipets), and surface-attachment chemistries. N-Glycopeptides have captured more attention to date than O-glycosylated structures, and solid-phase extraction (SPE) has been increasingly utilized. Beyond the now well-established hydrazide surface attachment of glycan moieties,41 lectins have still been utilized45,46 in recent studies, although most lectins seem more effective at the level of intact glycoproteins rather than glycopeptides. The interaction between boronic acid-based structures and glycan diols is increasingly explored47,48 for the benefits of glycopeptide enrichment. As an example, Chen and co-workers48 were able to use this principle as a universal means to map comprehensively the yeast N-glycoproteome and characterize its 332 glycoproteins. In combining the boronic acid-based SPE with peptide-N-glycosidase F (PNGase F) treatment in heavy-oxygen water, these authors were able to identify the relevant N-glycosylation sites through MS.
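The site-labeling logic mentioned above can be illustrated with a minimal sketch. PNGase F converts the formerly glycosylated Asn to Asp, and when the reaction is run in heavy-oxygen water the incorporated oxygen is 18O, so the site carries a mass shift of about +2.988 Da instead of the +0.984 Da of ordinary deamidation. The sketch assumes the canonical N-X-S/T sequon (X not Pro) and monoisotopic atomic masses; the function names and the example peptide are hypothetical and are not taken from the cited studies.

```python
import re

# Monoisotopic atomic masses (Da)
MASS_H = 1.007825
MASS_N = 14.003074
MASS_O16 = 15.994915
MASS_O18 = 17.999160

# Asn -> Asp replaces the side-chain NH2 group with OH.
DEAMIDATION_SHIFT_16O = MASS_O16 - (MASS_N + MASS_H)  # about +0.984 Da
DEAMIDATION_SHIFT_18O = MASS_O18 - (MASS_N + MASS_H)  # about +2.988 Da

# Lookahead so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(peptide):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(peptide)]


def labeled_mass_shift(n_sites, heavy_water=True):
    """Total mass shift (Da) for n_sites deglycosylated Asn residues."""
    per_site = DEAMIDATION_SHIFT_18O if heavy_water else DEAMIDATION_SHIFT_16O
    return n_sites * per_site


if __name__ == "__main__":
    peptide = "AANGTLNPSKNVSR"  # hypothetical peptide, for illustration only
    sites = find_sequons(peptide)
    print("candidate N-glycosites:", sites)
    print("18O shift if all occupied: %.4f Da" % labeled_mass_shift(len(sites)))
```

A sequon match only marks a candidate site; in practice the 18O shift observed by MS is what distinguishes a formerly occupied site from an unoccupied Asn or from spontaneous deamidation in ordinary water.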
The design of polar SPE beads with desirable properties for capturing glycopeptides has been a popular trend during the recent period as well. New ferromagnetic nanoparticles with L-cysteine-bonded zwitterions were synthesized49 and briefly tested with a horseradish peroxidase tryptic digest. In another study, a specialty copolymer reminiscent of hydrophilic interaction liquid chromatography (HILIC) supports was prepared, physically characterized, and tested with IgG digests.50 Toward a rational design of an ideal enrichment of glycopeptides, Qing and co-workers51 have recently described interesting dipeptide-based homopolymers and showed some preliminary results. These newly reported materials still await testing against a multitude of commercially available HILIC materials for comparison.
Typically, the first steps in a glycomics workflow consist of protein denaturation followed by the specific, regulated release of N-glycans or O-glycans. The most popular method to release N-glycans from mammalian glycoproteins still involves the use of PNGase F, now available from many commercial sources. PNGase F specifically lyses unsubstituted and core 6-fucosylated N-glycans from peptides and proteins. The reaction conditions are mild; the structural integrity of glycans, peptides, and their substitutions is retained. The products are ammonia, aspartic acid (in the remaining peptide chain), and the intact oligosaccharides in their nonreduced form. Released glycans are easily labeled with a fluorescent tag by reductive amination or other labels of choice to allow fluorescence detection and/or increased ionization in MS. However, PNGase F is expensive, does not cleave core-3 fucosylated glycans that are often present in plants and invertebrates, and is inefficient at cleaving N-glycans at the N-termini of peptides. Several recent reports have addressed these shortcomings by resorting to chemical release of glycans,52–55 optimization of PNGase F release of N-terminal glycans,56 immobilization of PNGase F for reuse,57,58 and the discovery and application of novel broad substrate-specific N-glycosidases.59,60 In addition, several innovative high-throughput methods for the combined glycomics and glycoproteomics release, labeling, and enrichment have been presented.
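Because the released oligosaccharides are obtained intact and in their nonreduced form, their expected masses follow directly from the monosaccharide composition. The sketch below is an illustration under stated assumptions rather than a method from the cited work: it uses standard monoisotopic residue masses, adds one water for the free reducing glycan, and reports the singly sodiated ion often seen in MALDI-MS. The dictionary and function names are hypothetical.

```python
# Monoisotopic masses (Da) of monosaccharide residues (monosaccharide minus H2O)
RESIDUE_MASS = {
    "Hex": 162.052823,     # hexose, e.g. mannose or galactose
    "HexNAc": 203.079373,  # N-acetylhexosamine, e.g. GlcNAc
    "dHex": 146.057909,    # deoxyhexose, e.g. fucose
    "NeuAc": 291.095417,   # N-acetylneuraminic acid
}
WATER = 18.010565
SODIUM_ION = 22.989218  # Na atom minus one electron


def glycan_mass(composition):
    """Monoisotopic mass of a free, nonreduced glycan from its composition."""
    residues = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return residues + WATER


def sodiated_mz(composition):
    """m/z of the singly charged [M+Na]+ ion."""
    return glycan_mass(composition) + SODIUM_ION


if __name__ == "__main__":
    man5 = {"Hex": 5, "HexNAc": 2}  # high-mannose Man5GlcNAc2
    fa2g2s2 = {"Hex": 5, "HexNAc": 4, "dHex": 1, "NeuAc": 2}
    for label, comp in (("Man5", man5), ("FA2G2S2", fa2g2s2)):
        print("%s: M = %.4f, [M+Na]+ = %.4f" % (label, glycan_mass(comp), sodiated_mz(comp)))
```

Composition alone does not distinguish isomers (for example, linkage or branching variants), which is one reason the tandem MS and separation methods discussed in this review remain necessary.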
In two recent papers, Song and co-workers have tested different modes for the large scale release of N-glycans55 exclusively, or N-glycans and O-glycans, from proteins and glycan nitriles from GSLs found in cells, tissues, and organs.52 In the first paper, a novel strategy to release and tag N-glycans called “threshing and trimming” (TaT) was reported. Pronase (“threshing”) protease treatment of glycoproteins, tissues, or organs first generates a pool of N-glycopeptides only one or a few amino acids long. Then, “trimming” with N-bromosuccinimide (NBS) under mild conditions leads to oxidative decarboxylation and generates aglycon moieties as free reducing glycans, nitriles, or aldehydes, depending on the reaction conditions. The nitriles are then tagged with 2-aminobenzamide (2-AB), the aldehydes with the bifunctional fluorescent linker 2-amino-N-(2-aminoethyl)benzamide (AEAB), and the reaction products are detected with MALDI-TOF-MS or HPLC with fluorescence detection. The TaT protocol is reportedly specific to N-glycan release; no O-glycans were detected. The mild reaction conditions should leave labile groups such as sialic acids, phosphates, and sulfates intact. One disadvantage with the TaT protocol is that the Pronase digestion takes as long as 48 h, similar to protocols that require overnight digests but in stark contrast to some optimized 5 min PNGase F protocols.61
More recently, Song and co-workers wielded household bleach for the oxidative release of all classes of glycans: N-glycans, O-glycans, and glycans from GSLs.52 Samples were treated with sodium hypochlorite (NaClO), the active component in household bleach, which seems to degrade proteins but leave N-, O-, and GSL-associated glycans intact. The liberated glycans are tagged with different fluorescent tags. This “brute force” strategy enabled the authors to release gram quantities of glycans. Although the paper is quite comprehensive, verification by other groups using a wide range of samples will help delineate the scope of this approach for analytical rather than preparative-scale sample preparations going forward. In contrast to the authors’ previous TaT release protocol and traditional PNGase F (N-glycans)/elimination (O-glycans) release methods, the sequential release of N- and O-glycans is very difficult to control with bleach. However, what it lacks in selectivity compared to the Pronase/NBS protocol, the bleach process gains in speed, with release of N-glycans within minutes. Interestingly, the derivatization of the liberated N-glycans is facile, but O-glycans and glycans from GSLs are not released in the free reducing form. Clearly, more mechanistic work on this bleach-mediated process is needed. The proteins are degraded, while the N-glycans are released as glycosylamines that then spontaneously convert to nonreduced free N-glycans. Some loss of reducing-end N-acetylglucosamine (GlcNAc, 20%) was observed, and the levels were dependent on time and temperature. Initially, the method was demonstrated to work on ovalbumin, bovine IgG, and horseradish peroxidase (HRP) and is thus not inhibited by core 3-fucosylation. The free glycans could be derivatized through reductive amination. After successful liberation of N-glycans from human saliva, several hundred grams of egg yolk and porcine tissues were treated with hypochlorite and 0.5–1% wet weight was recovered as free N-glycans. In addition, a library containing large quantities of 67 complex N-glycans was used to print a glycan microarray.
By combining a chemical deglycosylation method that does not wholly cleave N-glycans with LC–MS-based glycoproteomics, Chen and co-workers could perform a comprehensive analysis of the N-glycosylation sites in yeast.53 Lectin-enriched glycopeptides from yeast extract were treated with a mix of trifluoromethanesulfonic acid (TFMS) and toluene, which specifically cleaves N-glycans from the peptide while leaving the core GlcNAc amide bond intact, thus creating truncated N-glycopeptides. The treated peptides were analyzed with LC–MS and 555 N-glycosylation sites were discovered, of which 184 belonged to membrane proteins. This process was shown to be more efficient than Endo H treatment. Another recent study, which used 18O-labeling of the N-glycan sites in conjunction with PNGase F release of boronic acid-enriched glycopeptides, found even more N-glycosylation sites, however: a total of 816 N-glycan sites in 332 glycoproteins.48 Yuan and co-workers54 used 0.5 M NaOH to release N-glycans from model glycoproteins and plasma and claim that no peeling (degradation of the glycan) or loss of sialic acid occurs. O-Glycans were detected to some extent but mostly as degraded components due to peeling. The scope of a method that uses such high pH remains to be shown for analytical sample preparation, given that such alkaline conditions can be detrimental to labile compounds.
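The two site-mapping strategies compared above leave different mass signatures on the formerly glycosylated Asn: a retained core GlcNAc (one HexNAc residue, about +203.079 Da) after partial chemical deglycosylation, versus an 18O-labeled Asp (about +2.988 Da) after PNGase F release in heavy-oxygen water. The following sketch computes both for a peptide from standard monoisotopic residue masses. It is a simplified illustration: the peptide is hypothetical, other modifications (for example, cysteine alkylation) are ignored, and the names are not taken from the cited studies.

```python
# Monoisotopic amino acid residue masses (Da)
AA_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565
HEXNAC_REMNANT = 203.079373    # core GlcNAc left on Asn
ASN_TO_ASP_18O = 2.988261      # Asn -> Asp with 18O incorporation


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(AA_MASS[residue] for residue in sequence) + WATER


def site_signature_masses(sequence, n_sites=1):
    """Expected masses for the two site-labeling strategies."""
    base = peptide_mass(sequence)
    return {
        "unmodified": base,
        "core_GlcNAc_retained": base + n_sites * HEXNAC_REMNANT,
        "PNGaseF_18O": base + n_sites * ASN_TO_ASP_18O,
    }


if __name__ == "__main__":
    for label, mass in site_signature_masses("NGTK").items():  # hypothetical peptide
        print("%-22s %.4f Da" % (label, mass))
```

The much larger HexNAc increment is easier to distinguish from background deamidation than the 18O shift, although the source does not attribute the differing site counts of the two studies to this factor.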
In contrast to N-glycans, no enzyme is known that completely cleaves all O-glycans from a polypeptide chain. Traditionally, chemical release by elimination at high pH has been used, which completely degrades the protein/peptide during the reaction and also removes labile substitutions such as O-acetylation.62 A reducing agent such as sodium borohydride can be added to minimize peeling of the glycans.62 However, this reaction generates reduced O-glycan alditols, which cannot be readily derivatized and have lost some stereochemical information. One of the most efficient protocols for O-glycan release (up to 100-fold increase in sensitivity compared to normal elimination protocols) remains the use of extensive protease digestion of glycoproteins down to a single amino acid level, so that O-glycans are released in the ensuing solid-phase permethylation step.63
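The mass consequences of reduction and permethylation can be made concrete with a short sketch. It assumes standard monoisotopic increments for fully permethylated residues, a terminal contribution of one methyl plus one methoxy group for a nonreduced glycan, and an additional CH4 (about +16.031 Da) for a reduced alditol, which gains two hydrogens and one extra methylation site. These are general bookkeeping conventions, not values quoted from the cited protocols, and the names are hypothetical.

```python
# Monoisotopic increments (Da) of fully permethylated residues
PERMETHYLATED_RESIDUE = {
    "Hex": 204.099773,
    "HexNAc": 245.126323,
    "dHex": 174.089209,
    "NeuAc": 361.173667,
}
END_GROUPS = 46.041865          # CH3 (nonreducing end) + OCH3 (reducing end)
REDUCTION_INCREMENT = 16.031300  # extra CH4 for a permethylated alditol
SODIUM_ION = 22.989218


def permethylated_mz(composition, reduced=False):
    """m/z of the [M+Na]+ ion of a permethylated glycan or glycan alditol."""
    mass = sum(PERMETHYLATED_RESIDUE[name] * count
               for name, count in composition.items())
    mass += END_GROUPS
    if reduced:
        mass += REDUCTION_INCREMENT
    return mass + SODIUM_ION


if __name__ == "__main__":
    core1 = {"Hex": 1, "HexNAc": 1}                 # Gal-GalNAc O-glycan
    sialyl_core1 = {"Hex": 1, "HexNAc": 1, "NeuAc": 1}
    for label, comp in (("core 1", core1), ("sialyl core 1", sialyl_core1)):
        print("%s alditol [M+Na]+: %.4f" % (label, permethylated_mz(comp, reduced=True)))
    print("Man5 N-glycan [M+Na]+: %.4f" % permethylated_mz({"Hex": 5, "HexNAc": 2}))
```

The fixed 16.031 Da offset means that reduced and nonreduced forms of the same composition are readily told apart in a permethylated profile.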
Recently, some protocols for a nonreductive O-glycan release process were published using ammonium carbamate64 or hydrazinolysis65–67 and, as discussed above for N-glycans, bleach.52 O-Glycan release with NaClO requires high concentrations and longer incubation times compared to N-glycan release. While the polypeptides are degraded, the O-glycans are detected in three forms: nonreduced glycans and O-glycans with the O-glycosidic linkage bond to Ser/Thr in the form of a glycolic acid or lactic acid (O-glycan acids). The O-glycan acid can be derivatized with activation using 1-ethyl-3-(3-(dimethylamino)propyl)carbodiimide hydrochloride (EDC) and N-hydroxysuccinimide (NHS) and labeled with a fluorescent mono-Fmoc-(fluorenylmethoxycarbonyl)-modified ethylenediamine. Sulfation was reported to be a common modification of the O-glycans released from PSM (porcine submaxillary mucin). Piperidine deprotection allowed the derivatized glycans to be printed on a glycan array. To show the applicability on a larger scale, O-glycans were released from mouse stomach, small intestine, and colorectal samples. The NaClO treatment generated O-glycopeptides (rather than O-glycan acids), which were subsequently purified with C18 and carbon SPE columns, followed by permethylation. While recent years have seen much progress, a universally reliable method to release nonreduced O-glycans still awaits some creative work ahead.
Gizaw and co-workers performed a comprehensive N-, O-, and GSL-glycan profiling of brain and sera from Huntington’s disease transgenic mice utilizing a high-throughput glycoblotting technique.64 O-Glycans were removed with ammonium carbamate at 60 °C for 40 h, followed by washing steps, chemoselective capture on BlotGlyco H hydrazide beads, and subsequent derivatizations. The original protocol was developed previously, and the authors claim that no significant loss due to peeling was observed.68
Hydrazine releases O-glycans with a free reducing end, but the mechanism is not known in detail. A recent study compared the efficiency of ammonia-based and hydrazine-based release of O-glycans from two recombinant proteins with 4 and 13 potential O-linked glycan sites, respectively.67 Hydrazinolysis was shown to be 20–30 times more efficient and to display no bias, thereby rendering the process suitable for quantitative nonselective release. Although hydrazinolysis appears to be very efficient, the method requires special chemical handling procedures and may not be suitable for broader use. A disadvantage shared by β-elimination, ammonium salt, and hydrazinolysis methods is the loss of labile substitutions such as O-acetylation. It remains to be shown whether bleach has the same effect.
Modern LC–MS based glycoproteomics may not be dependent on enzymatic or chemical release in order to identify the structure of the glycans and their site occupancy. Ideally, sequential tandem MS of glycopeptides with different fragmentation techniques could identify glycoproteins from the information on both the glycan structure and peptide sequence in a single LC–MS run as discussed below.
Glycosphingolipids form a group of lipids that possess a carbohydrate headgroup consisting of mono- or oligosaccharides attached to the lipid sphingosine or ceramide. Release of the glycan is usually performed with the enzyme endoglycoceramidase II, which is commercially available. One drawback is that this enzyme is not completely nonselective. Recently, a paper was published that provided a protocol for high-throughput analysis of GSLs from mammalian cell surfaces and serum by releasing the glycan moiety with a novel recombinant endoglycoceramidase (EGCase I) with broad GSL specificity.69 Released glycans were derivatized with a fluorescent tag and analyzed through LC. The workflow builds upon the same group's previous development of a high-throughput, automated N-glycan sample preparation platform for glycoprofiling of glycoproteins.70 Since then, the group has reported even higher throughput methods.71 Chemical release of the glycan moiety is also possible. Song et al. tested bleach release on both gangliosides and brain tissue in aqueous conditions.52 In both cases, the glycans were liberated and observed as nitriles. For pure gangliosides, treatment with N-bromosuccinimide (NBS) was enough to release glycan nitriles. The released nitriles were permethylated or labeled with tags such as AB in a multistep fashion. Gizaw et al. released the glycan headgroup from GSLs through ozonolysis and alkaline degradation as part of comprehensive high-throughput glycoblotting studies of N-, O-, and GSL-glycans from human and mouse tissues.64,72 Ozonolysis is only applicable to sphingosine containing a C═C bond.
Glycans are challenging biomolecules to study at low analyte levels through common analytical methods without some type of derivatization. Since most carbohydrates do not possess any chromophores, spectroscopic labels must be introduced through derivatization, usually at the reducing end. Additionally, glycans do not ionize well and some residues, particularly sialic acid and fucose, may easily be lost during MS experiments. Derivatization of the glycan may both stabilize the molecule and improve its ionization efficiency. The common permethylation process certainly serves both of these functions. During the last several years, the incorporation of heavy and light isotopes has been used to assist in quantification of glycans through MS. New isobaric tags are being introduced in ways similar to their previous uses in quantitative proteomics.
Permethylation of glycans ideally leads to the complete conversion of all free hydroxyl groups to methyl ethers, esterification of sialic acids and other carboxylic acid-containing moieties, and the addition of a methyl group at the nitrogen in the N-acetyl groups of sugars such as N-acetylgalactosamine (GalNAc), GlcNAc, and sialic acid.14 The key advantages of permethylation include (a) improved ionization compared to native glycans, (b) neutralization of sialic acids, which enables acquisition of complete glycan profiles from complex samples with MS in the positive ion mode, (c) more prevalent cross-ring fragments, which yield linkage information with MSn,18,73 and (d) increased hydrophobicity of the permethylated glycans, which allows the use of standard C18 reversed-phase LC–MS. Although protocols for permethylation have been perfected over time to reduce artifacts and side reactions, complete permethylation of large glycans is still challenging.74 Whereas LC columns packed with porous graphitized carbon (PGC) may separate native isomeric glycans through high column selectivity, this separation is usually challenging for permethylated glycans analyzed with C18 LC–MS. However, a recent study claims that isomeric separation of permethylated glycans is possible with PGC LC–MS at elevated temperatures, which allow the hydrophobic solutes to elute from the carbon column.75,76 Using the standard protocol of permethylation, sulfated and phosphorylated glycans are usually not retained. Several adaptations have been put forward to allow these glycans to be enriched and detected in their permethylated form.77,78 The permethylation protocol has traditionally been labor-intensive, but recently a highly automated workflow suitable for high-throughput analysis was reported.79 With the use of robotics, the glycan profiles of monoclonal antibodies and recombinant human erythropoietin were identified and quantified.
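As an illustration of the bookkeeping behind complete permethylation, the following minimal Python sketch counts the methylation sites of a free (nonreduced) glycan from its composition and computes the resulting neutral monoisotopic mass. The per-residue site counts (free hydroxyls, plus the N-acetyl nitrogen and the sialic acid carboxyl group) and the function names are illustrative assumptions, not part of any cited protocol; sulfated, phosphorylated, or reduced glycans would need different rules.

```python
# Monoisotopic residue masses (Da) of common monosaccharide residues
RESIDUE_MASS = {"Hex": 162.05282, "HexNAc": 203.07937,
                "dHex": 146.05791, "NeuAc": 291.09542}
# Assumed methylation sites per residue inside a chain: free OH groups,
# plus the N-acetyl nitrogen (HexNAc, NeuAc) and the carboxyl group (NeuAc)
SITES_PER_RESIDUE = {"Hex": 3, "HexNAc": 3, "dHex": 2, "NeuAc": 5}
WATER = 18.01056       # added once per glycan for the chain termini
CH2 = 14.01565         # net mass gain per methylation (H replaced by CH3)
TERMINAL_SITES = 2     # one hydroxyl at each end of a free, nonreduced glycan


def native_mass(composition):
    """Neutral monoisotopic mass of the underivatized glycan."""
    return WATER + sum(n * RESIDUE_MASS[r] for r, n in composition.items())


def methylation_sites(composition):
    """Number of methyl groups added on complete permethylation."""
    return TERMINAL_SITES + sum(n * SITES_PER_RESIDUE[r]
                                for r, n in composition.items())


def permethylated_mass(composition):
    """Neutral monoisotopic mass after complete permethylation."""
    return native_mass(composition) + CH2 * methylation_sites(composition)


man5 = {"Hex": 5, "HexNAc": 2}   # high-mannose Man5GlcNAc2
print(methylation_sites(man5), round(permethylated_mass(man5), 4))
```

Incomplete permethylation would appear as a series of peaks 14.016 Da below the expected mass, which is one practical use of such a calculation.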
Stabilization and neutralization of sialic acids are often needed for MALDI-MS analysis of acidic glycans. Nonspecific methylamidation of all linkage forms of terminal sialic acids can also drastically improve the chromatographic isomeric separation of multisialylated glycans with PGC columns.80 In addition, glycopeptide treatment with methylamine results in methylamidation of not only sialic acids but also the peptide (Asp, Glu, and C-terminus). These modifications increase the ionization efficiency and facilitate the annotation of glycosites.81,82 Isotopic labeling with “heavy” and “light” methylamidation allows quantitative comparison of different glycopeptide fractions.81 Solid-phase methylamidation has also been demonstrated to reduce sample losses and contamination during the derivatization process.83
Linkage-specific derivatization of terminal Siaα2,3 versus Siaα2,6 moieties allows these isomers to be distinguished by their mass differences.84 DMT-MM (4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride) dissolved in methanol can convert Siaα2,6 residues to methyl esters, whereas Siaα2,3 residues are converted to cyclic internal esters (lactones).84 A modification of this protocol with a subsequent permethylation step has now also been introduced.85 DMT-MM dissolved in ammonium chloride selectively amidates Siaα2,6 residues, while Siaα2,3 residues spontaneously form cyclic lactones prior to subsequent permethylation. Recently, the Wuhrer group published several protocols suitable for high-throughput glycan profiling that include linkage-specific derivatization of terminal sialic acids.25,34,86,87 In one paper, a derivatization protocol was established using a combination of carboxylic acid activators in ethanol to achieve highly efficient ethyl esterification of α2,6-linked sialic acids and lactonization of α2,3-linked variants under mild conditions.86 The protocol was demonstrated on the N-glycan profile of human plasma with MALDI-MS, which displayed a large number of linkage-specific glycans. In the second study, the same linkage-specific derivatization was included in a highly automated setup with robotic ethyl esterification, HILIC-SPE, and MALDI spotting with 96- and 384-well sample plates.25 Throughput time was 2.5 h for the first plate of 96 samples and 1 h for each additional plate. The same group also expanded their work to sialic acid linkage-specific stabilization of IgG glycopeptides.87 The protocol was slightly changed to use a linkage-specific dimethylamidation instead of alcohols, which was specific for the sialic acid linkages and the carboxylic acid groups of the peptide portion.
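The reported throughput figures lend themselves to a quick back-of-the-envelope estimate. The sketch below assumes, purely for illustration, that the 2.5 h and 1 h figures scale linearly with the number of 96-well plates and that partially filled plates take as long as full ones; the function name is hypothetical.

```python
import math

FIRST_PLATE_HOURS = 2.5        # reported time for the first 96 samples
ADDITIONAL_PLATE_HOURS = 1.0   # reported time for each additional plate
SAMPLES_PER_PLATE = 96


def estimated_hours(n_samples):
    """Rough total processing time, assuming linear scaling per plate."""
    if n_samples <= 0:
        return 0.0
    plates = math.ceil(n_samples / SAMPLES_PER_PLATE)
    return FIRST_PLATE_HOURS + ADDITIONAL_PLATE_HOURS * (plates - 1)


print(estimated_hours(384))   # four plates
```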
In their most recent study, Holst and co-workers employed linkage-specific in situ sialic acid derivatization for N-glycan mass spectrometry imaging (MSI) of formalin-fixed paraffin-embedded tissues.34 The protocol was slightly modified to include a second, Siaα2,3 linkage-specific amidation step with ammonium hydroxide after dimethylamidation (lactonization). This second step was needed to stabilize the Siaα2,3-specific derivatization, which would otherwise be prone to hydrolysis during the in situ PNGase F treatment. Solid-phase, two-step derivatization has also been demonstrated on glycopeptides from transferrin, fetuin, IgG, and serum by Li and co-workers.88 Briefly, after derivatization, Siaα2,6 formed ethyl esters (+28 Da) and Siaα2,3 formed methyl amides (+13 Da). Interestingly, the neutralization of sialic acids also greatly improves the isomeric resolution of derivatized glycans in microchip electrophoresis.89
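The mass differences that make linkage isomers distinguishable follow directly from the elemental changes at the sialic acid carboxyl group. The following sketch computes the expected monoisotopic shift for the ethyl ester (+28 Da) and methyl amide (+13 Da) products mentioned above; the function name and the restriction to these two products are illustrative assumptions.

```python
# Monoisotopic atomic masses (Da)
H, C, N, O = 1.007825032, 12.0, 14.003074004, 15.994914620

# Net change at the carboxyl group:
#   ethyl ester:  -OH -> -OC2H5   (gain of C2H4)
#   methyl amide: -OH -> -NHCH3   (gain of C, N, 3 H; loss of O)
ETHYL_ESTER_SHIFT = 2 * C + 4 * H
METHYL_AMIDE_SHIFT = C + N + 3 * H - O


def derivatization_shift(n_alpha26, n_alpha23):
    """Total mass shift for a glycan carrying the given numbers of
    alpha2,6-linked (ethyl ester) and alpha2,3-linked (methyl amide)
    sialic acids."""
    return n_alpha26 * ETHYL_ESTER_SHIFT + n_alpha23 * METHYL_AMIDE_SHIFT


# Two monosialylated isomers that are isobaric before derivatization
delta = derivatization_shift(1, 0) - derivatization_shift(0, 1)
print(round(ETHYL_ESTER_SHIFT, 4), round(METHYL_AMIDE_SHIFT, 4), round(delta, 4))
```

After derivatization, each α2,6/α2,3 exchange therefore separates otherwise isobaric species by about 15 Da, which is readily resolved by MALDI-MS.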
Derivatization of glycoconjugates with a fluorophore serves a dual purpose: it enhances the sensitivity of analysis with different detectors and increases the hydrophobicity of the highly hydrophilic sugars, which can increase chromatographic retention in the reversed-phase LC mode.90 Some of the most commonly used fluorescence reagents for glycan analysis include 2-aminobenzamide (2-AB), anthranilic acid (2-AA), 2-aminopyridine (2-AP), and 1-phenyl-3-methyl-5-pyrazolone (PMP).90 2-AA-labeled glycans are most often detected with fluorescence, but some reports analyze 2-AA-labeled N-glycans with reversed-phase LC–MS.91 Whereas the previously reported literature indicates negligible desialylation during 2-AB labeling of N-linked glycans, a recent report indicates that there is some loss of sialic acid from highly sialylated glycans during 2-AB derivatization and possibly during other reductive amination procedures as well.92 Recently, 1,3-di(2-pyridyl)-1,3-propanedione (DPPD) was used to place a 2-pyridylfuran (2-PF) fluorescent tag on monosaccharides that were analyzed using high-performance liquid chromatography (HPLC).90 With this 2-PF tag, subfemtomole levels of detection were achieved.90 However, since the C2 stereocenter is lost during the derivatization process, C2 epimers, such as D-glucose versus D-mannose, are indistinguishable by HPLC analysis.90 Note that in capillary electrophoresis with laser-induced fluorescence detection (CE-LIF), the label serves the same purpose as in HPLC, but other labels are preferred, such as 8-aminopyrene-1,3,6-trisulfonic acid (APTS), which provides a triply negative charge.93
Although some fluorescent labels improve ionization in MS analysis, there have been several attempts to improve this effect further. Lauber and co-workers reported a new label, exclusively available in a commercial kit, with which N-glycans are released and tagged with RapiFluorMS (RFMS) in 30 min.61 The label reacts with the glycosylamine at the reducing end immediately after PNGase F treatment. The label contains a quinoline fluorophore as well as a tertiary amine for strong positive mode ionization. The label is compatible with standard HILIC-LC–MS protocols. The authors demonstrate 14× higher fluorescence and 160× stronger MS signal compared to 2-AB-labeled glycans.61
A common problem in LC–MS/MS fragmentation of native or reductively labeled glycans is fucose migration from the antenna to the core or vice versa, which makes the annotation ambiguous. Reducing-end labeling with procainamide hydrochloride has been suggested as a solution.94 LC–MS analysis showed that the ionization was improved 10–50 times compared to the 2-AB label and that unambiguous diagnostic ions for core fucosylation were observed.94
O-Glycan labeling has traditionally been more challenging than N-glycan labeling, since chemical release via β-elimination results in O-glycans in a reduced form that is no longer easily derivatized. Previous studies have found that β-elimination performed in the presence of 1-phenyl-3-methyl-5-pyrazolone (PMP) allows for a one-pot simultaneous release and labeling of O-glycans.95 The resulting PMP-labeled O-glycans may be analyzed with HPLC, CE, or MS.95
Labeling strategies for O-GlcNAc sites in glycopeptides using chemical modifications, chemoenzymatic modifications, and metabolic labeling have recently been disclosed; they enable enrichment and easier detection with LC–MS and thereby dramatically increase the number of O-GlcNAc sites discovered.96,97 Boysen and co-workers have used a novel MS strategy to map the O-glycosylation in E. coli. The O-GlcNAc moieties on peptides were replaced with an affinity tag that enabled their enrichment with affinity columns prior to LC–MS analysis. This was achieved with β-elimination of O-linked glycans followed by Michael addition of a 2-aminoethyl phosphonic acid (AEP) group and subsequent selective enrichment of tagged peptides on titanium dioxide columns. The authors report that LC–MS on the tagged peptides identified an impressive 618 O-GlcNAc sites, while only four had been published previously.96 By contrast, Griffin and co-workers labeled the O-GlcNAc by chemoenzymatic attachment of an azide-containing monosaccharide onto O-GlcNAc proteins.97 The unnatural monosaccharide was further modified with a linker to facilitate enrichment prior to LC–MS analysis. The protocol was employed on model proteins with known O-GlcNAc profiles for comparison.97 An approach to monitor the O-glycome of living cells using a metabolic labeling strategy, termed cellular O-glycome reporter/amplification (CORA), was reported by Kudelka and co-workers.98 All mucin-type O-glycans start with GalNAc-α1-O-Ser or -Thr. A chemical analogue, Bn-α-GalNAc, that structurally mimics this precursor is accepted as a glycosyltransferase substrate.
When Bn-α-GalNAc is added to the cell culture medium, it is transported into the secretory pathway, modified by glycosyltransferases, and secreted into the medium as biosynthetic Bn-O-glycans that can be easily purified, permethylated, and analyzed by MS.98 Although such metabolic labeling strategies always have the caveat of potentially altering natural cellular processes and are limited to artificial systems for sample production, the authors claim that this method resulted in an ~100–1000-fold increase in sensitivity compared to conventional O-glycan release and identified a very complex repertoire of O-glycans in several cell types from human and mouse.
Recent instrumental and analytical advances notwithstanding, many methodologies in glycan analysis are still far from being routine and reliably reproducible. Standardized reporting guidelines for all steps in a glycomic analysis are therefore a crucial step toward critical evaluation, dissemination of data sets, and comparison of results obtained in different laboratories. To this end, the minimum information required for a glycomic experiment (MIRAGE) initiative was established in 2011, thus far providing guidelines for data reporting of mass spectrometry, liquid chromatography, sample preparation, and data handling.99–101 The MIRAGE project has been a concerted effort to stimulate the wider scientific community to improve experimental protocols and reach reproducible data sets.
As no single method seems to determine quantitatively all structural features of glycans, no particular “gold standard” method yet exists. The most common form of glycan quantification is hydrophilic interaction liquid chromatography (HILIC) with fluorescence detection of glycans that have been fluorescently labeled by reductive amination.102 This method is extensively used in the pharmaceutical industry and is not difficult to validate under GMP regulations.102 Relative and absolute quantification can also be obtained using MS. Several quantitative techniques developed originally for proteomics are now being used in the glycomics field as well. A popular quantitative method is chemical labeling with an isotope tag that does not affect chromatographic separation or ionization in LC–MS while providing an isotopic mass shift to differentiate the glycans. “Light” and “heavy” isotope-labeled glycans are mixed 1:1, and the relative quantities are determined from the corresponding MS peak heights. The isotope tag may be incorporated chemically, such as during permethylation of N- and O-linked glycans (CD3I or 13CH3I vs 12CH3I)103–105 and reductive amination of N-glycans,106–108 or enzymatically, such as during PNGase F release of N-glycans.109,110 A variant of isotope labeling is the use of isobaric labels (13CH3I or 12CH2DI), which yield the same nominal mass in low-resolution MS but can be differentiated in high-resolution accurate mass MS.103 This approach has several advantages compared to isotopic labeling during permethylation. In the original isotopic permethylation protocol, the mass difference between the heavy and light form of each glycan is variable, since it is proportional to the number of methylation sites. This makes interpretation of the data complex.
However, the combination of high- and low-resolution MS and MS/MS makes differentiation of some isomeric glycans possible through isobaric labeling.103 Isobaric labeling at the reducing end has also been reported for glycan quantification.111
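The contrast between isotopic and isobaric permethylation can be made concrete with a short calculation. This sketch, with hypothetical function names, computes the heavy/light spacing for the isotopic reagents (which grows with the number of methylation sites) and the much smaller spacing for the 13CH3/12CH2D isobaric pair, together with a rough m/Δm estimate of the resolving power needed to separate the latter. The site count is taken as an input and is not derived here.

```python
# Monoisotopic masses (Da)
MASS_H, MASS_D = 1.007825032, 2.014101778
MASS_12C, MASS_13C = 12.0, 13.003354835

# Mass gained per methylation site by the heavy reagent relative to 12CH3I
PER_SITE_SHIFT = {
    "CD3I": 3 * (MASS_D - MASS_H),
    "13CH3I": MASS_13C - MASS_12C,
}


def isotopic_pair_spacing(n_sites, heavy_reagent):
    """Heavy/light mass difference; proportional to the number of sites."""
    return n_sites * PER_SITE_SHIFT[heavy_reagent]


def isobaric_pair_spacing(n_sites):
    """Mass difference between 12CH2D- and 13CH3-labeled forms."""
    return n_sites * ((MASS_D - MASS_H) - (MASS_13C - MASS_12C))


def resolving_power_needed(mass, n_sites):
    """Rough m / delta-m estimate to separate the isobaric pair."""
    return mass / isobaric_pair_spacing(n_sites)


n = 23  # e.g., a glycan with 23 methylation sites (assumed input)
print(round(isotopic_pair_spacing(n, "CD3I"), 3))
print(round(isotopic_pair_spacing(n, "13CH3I"), 3))
print(round(isobaric_pair_spacing(n), 4))
```

Because the isobaric spacing is only about 2.9 mDa per site, the pair coincides in low-resolution MS and is split only on a high-resolution instrument, consistent with the description above.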
Isobaric tags used for multiplexing are often referred to as tandem mass tags (TMTs). While common in proteomics, these tags have been less frequently employed in glycomics. Isotopic mass shift labeling relies on MS1 signals in LC–MS for quantification, while isobaric quantification with TMTs relies on the reporter ions in MS2 and MS3. Yang and co-workers presented a novel isobaric tag, Quaternary Amine Containing Isobaric Tag for Glycan (QUANTITY), which can completely label glycans and generate strong reporter ions.111 Up to four different samples can be labeled through reductive amination and analyzed simultaneously for the relative quantification of glycans. A carbonyl-reactive aminoxyTMT was recently made commercially available to allow 6-plex multiplexing of labeled glycans. Zhou and co-workers recently used the aminoxyTMT tags to compare the relative quantities of serum glycans in patients with esophageal diseases.112 The labeling was fast and efficient, but sodium adducts were needed to achieve efficient MS/MS of highly branched N-glycans.112
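Reporter-ion quantification reduces, in its simplest form, to comparing the intensities of the reporter channels within one MS2 or MS3 spectrum. The sketch below is a deliberately minimal illustration: the channel names are hypothetical, and corrections applied in practice (isotopic impurity of the tags, co-isolated precursors) are not modeled.

```python
def relative_abundance(reporter_intensities):
    """Convert reporter-ion intensities from one spectrum into the
    fraction of the total signal contributed by each channel."""
    total = sum(reporter_intensities.values())
    if total <= 0:
        raise ValueError("no reporter-ion signal")
    return {channel: intensity / total
            for channel, intensity in reporter_intensities.items()}


# Hypothetical four-channel spectrum for one glycan precursor
spectrum = {"ch1": 1200.0, "ch2": 2400.0, "ch3": 1200.0, "ch4": 1200.0}
for channel, fraction in relative_abundance(spectrum).items():
    print(channel, round(fraction, 2))
```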
PNGase F release of N-glycans in the presence of H₂¹⁸O yields a 2 Da mass shift of the former N-glycosylation site and of the released N-glycan compared to the products obtained in normal H₂¹⁶O. These types of labels are easy to use, have no side reactions, and provide a good linear response. However, the interpretation of data may be complicated, and therefore software tools to overcome these hurdles have been introduced. Releasing glycans in the presence of “heavy” or “light” isotopically labeled water and mixing the two fractions 1:1 prior to MS analysis allows for relative quantification of different samples from standard glycoproteins or serum.109 Double isotope labeling of both the peptides and the glycans via sequential enzymatic digestion with PNGase F and trypsin in the presence or absence of heavy H₂¹⁸O has also been introduced.110 A reductive step with NaBH4 or NaBD4 to convert the aldehyde to an alcohol was added after PNGase treatment in the presence of H₂¹⁸O, which prevents exchange between 18O and 16O in the used water.113 In the reductive process, one additional hydrogen atom is added to a reduced glycan and an additional isotope difference is thus introduced in the form of H+ or D+.113
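One reason a 2 Da label complicates interpretation is that the heavy monoisotopic peak coincides with the natural M+2 isotope peak of the light form. The sketch below shows a simple correction under stated assumptions: the theoretical M+2/M intensity ratio of the light glycan is supplied by the user (for example, from an isotope pattern calculator), and labeling is assumed to be complete. It is an illustration only, not the algorithm of any cited software tool.

```python
def heavy_to_light_ratio(i_light_mono, i_plus2, m2_fraction):
    """Estimate the heavy/light ratio for an 18O/16O-labeled pair.

    i_light_mono: intensity of the light monoisotopic peak (M)
    i_plus2:      observed intensity at M + 2 Da (heavy + light M+2)
    m2_fraction:  theoretical M+2 / M ratio of the light form (assumed input)
    """
    if i_light_mono <= 0:
        raise ValueError("light monoisotopic peak not observed")
    # Remove the light form's natural M+2 contribution from the heavy signal
    heavy = max(i_plus2 - m2_fraction * i_light_mono, 0.0)
    return heavy / i_light_mono


# Hypothetical intensities for a 1:1 mixture
print(round(heavy_to_light_ratio(100.0, 130.0, 0.30), 3))
```

Without the correction, the same data would suggest a ratio of 1.3, an error that grows with glycan size because the M+2 peak becomes more abundant.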
Other quantification strategies involve isotope tagged reducing end labels such as 13C6-2-AA.106–108 Additional strategies include the use of the reducing-end label Girard’s Reagent P in deuterated or nondeuterated form114 or reductive amination using 12C6-aniline and 13C6-aniline as isotope-coded labeling reagents.115,116
A combination of metabolic labeling with sialic acid and GalNAc analogues, which enables glycopeptide enrichment and isotope tagging, was recently published (IsoTag).117–119 As a first step, LC–MS analysis of labeled glycopeptides is performed to identify which peptides carry the isotope label and an inclusion list is generated that is further used in a second targeted LC–MS/MS analysis. This second step subsequently fragments only the labeled glycopeptides from that list.
Recently, a monoclonal antibody with isotopically labeled glycans was introduced, and its feasibility as an internal standard in different glycomic analyses with LC–MS has been demonstrated.120 Absolute quantification can also be facilitated by spiking all samples to be analyzed by MS with an internal “unnatural” glycan standard, which allows normalization of the MS signal across different analytical experiments.72
Kim and co-workers recently introduced a method for quantifying N-linked glycoproteins containing cysteines, termed isotope-coded carbamidomethylation (iCCM).45 Treatment with iodoacetamide (IAA) or its isotopologue IAA-13C2,D2 prior to tryptic digestion results in carbamidomethylated Cys residues and peptides that differ by 4 Da. N-Glycopeptides were enriched with lectins in an online microbore hollow fiber enzyme reactor and detected with LC–MS (mHFER-nLC-MS/MS). Standard proteins and liver cancer sera were then analyzed by quantitative MS using multiple reaction monitoring (MRM), a technique discussed in more detail in the next section.
In the area of O-linked glycan quantification, metabolic labeling with sugar-nucleotide analogues (facilitating purification, detection, and quantification) has also been reported.117,118 O-GlcNAc analogues can be used to specifically enrich O-GlcNAcylated peptides and identify them with mass spectrometry.121 Recently, a protocol using stable-isotope coded labeling by zero or five deuterium atoms (D0/D5) with 1-phenyl-3-methyl-5-pyrazolone (PMP) was introduced.122,123 This protocol allows for one-pot simultaneous release and labeling of O-glycans and assists the relative quantitative comparison between “heavy” and “light” labeled fractions.
Finally, glycans and glycopeptides can also be quantified with multiple reaction monitoring, MRM. In MRM, as discussed further below, the concentration of the unknown sample is determined by comparing its MS response to that of a known standard.
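The comparison of an unknown's response with that of a known standard can be sketched as a calibration calculation. The example below assumes a linear response over the calibrated range and uses an ordinary least-squares line through a dilution series of the standard; the function names and numbers are hypothetical, and real MRM assays typically also normalize to an internal standard.

```python
def fit_line(concentrations, responses):
    """Ordinary least-squares fit of response = slope * conc + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(response, slope, intercept):
    """Invert the calibration line to estimate the concentration."""
    return (response - intercept) / slope


# Hypothetical standard curve: concentration (arbitrary units) vs peak area
slope, intercept = fit_line([1.0, 2.0, 4.0, 8.0], [210.0, 405.0, 798.0, 1605.0])
print(round(quantify(1000.0, slope, intercept), 2))
```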
The fundamental aspects of ESI and MALDI mass spectrometry and the rapidly growing applications in glycomics and glycoproteomics have been reviewed elsewhere;11–13 therefore, this section will focus on recent technical innovations in sample ionization and detection. A general overview of various fragmentations of N-glycopeptides, and the information that is available from each cleavage pattern is presented in Figure 3.
Although the monosaccharide constituents and linkage patterns of the resulting fragment ions are usually assumed based on prior knowledge of biosynthetic pathways in mammals, an analysis of N- and O-glycans from less well understood sources cannot make such presumptions. The entire monosaccharide subset that matches the observed molecular weight, as well as any possible linkage, must be considered. This de novo sequencing remains extremely challenging because of the structural similarities between monosaccharides and because the commonly used MS-based analytical techniques are generally regarded as achiral. However, through a variant of Cooks’ fixed ligand kinetic method, which relies on measuring the dissociation rates of noncovalent gas-phase metal–amino acid–monosaccharide complexes, complete individual discrimination of all 24 possible hexoses125 and 12 pentoses126 was recently demonstrated. The further development of such methods and their incorporation into workflows promises the accurate determination of glycopeptide structures from a much wider array of sources.
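The first step of such a de novo analysis, listing every monosaccharide composition consistent with an observed mass, can be sketched as a brute-force enumeration. The residue classes, count limits, and tolerance below are illustrative assumptions; a real search would include further residues and modifications.

```python
from itertools import product

# Monoisotopic residue masses (Da); each class hides several stereoisomers
RESIDUE_MASS = {"Hex": 162.05282, "HexNAc": 203.07937,
                "dHex": 146.05791, "NeuAc": 291.09542}
WATER = 18.01056


def candidate_compositions(observed_mass, tol=0.01, max_counts=(10, 8, 3, 4)):
    """Return all compositions whose neutral mass lies within tol (Da)
    of observed_mass. max_counts follows the order of RESIDUE_MASS."""
    names = list(RESIDUE_MASS)
    hits = []
    for counts in product(*(range(m + 1) for m in max_counts)):
        mass = WATER + sum(c * RESIDUE_MASS[n] for c, n in zip(counts, names))
        if abs(mass - observed_mass) <= tol:
            hits.append(dict(zip(names, counts)))
    return hits


for composition in candidate_compositions(1234.4334):
    print(composition)
```

A composition such as Hex5HexNAc2 still says nothing about which hexoses are present or how they are linked, which is precisely the gap that methods able to discriminate individual stereoisomers aim to close.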
For linkage determination, tandem MS with collision-induced dissociation is currently the most commonly used technique. It has been shown that glycosidic linkage type can be differentiated in disaccharides regardless of the monosaccharide constituents present through MSn (n = 3–5) of 18O-labeled disaccharides in the negative ion mode.127,128 Through this methodology, linkage position (1-2, 1-3, 1-4, 1-6) and anomericity (α or β) were successfully identified in larger oligosaccharides, which were fragmented into a series of disaccharide ion ladders through MSn (n = 3–5) via collision-induced dissociation; the resulting spectra were then compared to those of known disaccharide standards using a spectral similarity score.127,128 A complementary technique to assess linkage information in oligosaccharides employed infrared multiple-photon dissociation (IRMPD) via a tunable CO2 laser (9.2–10.7 μm).129 As a proof of concept, a series of isomeric glucose homodimers was analyzed, and it was observed that each isomer exhibited wavelength-dependent photodissociation.129 With this result, two isomeric glucose homotrimers were trapped and fragmented in a Fourier-transform ion cyclotron resonance (FTICR) cell to their respective disaccharide fragments, which were subsequently IR-irradiated at tunable wavelengths.129 These disaccharide fragment ions were seen to behave similarly in their photodissociation patterns to the intact disaccharides,129 which indicates that this methodology could have applications for de novo linkage analysis of even larger oligosaccharides.
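Matching a fragment spectrum against reference disaccharide standards requires some numerical similarity measure. The cited studies' exact scoring function is not described here, so the sketch below uses a binned cosine similarity as a stand-in, which is one common choice; the peak lists and the bin width are hypothetical.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into fixed-width bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return dict(binned)


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine of the angle between two binned spectra (0 to 1)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical fragment spectra: an unknown and two reference standards
unknown = [(221.1, 100.0), (263.1, 40.0), (281.1, 15.0)]
standard_1 = [(221.1, 95.0), (263.1, 45.0), (281.1, 10.0)]
standard_2 = [(221.1, 20.0), (251.1, 100.0), (281.1, 60.0)]
print(cosine_similarity(unknown, standard_1) > cosine_similarity(unknown, standard_2))
```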
In addition to the challenges of determining basic glycan structures from sources whose biosynthetic pathways are not well understood, detecting and distinguishing polysialylated N-glycans with present analytical techniques remains difficult. One of the most common ionization mechanisms for MS of N-glycans is electrospray ionization (ESI); however, several disadvantages plague this ionization type, including in-source fragmentation that may lead to structural misidentification and poor sensitivity that may prevent identification altogether.130 A recent significant development to overcome this shortcoming is the use of subambient pressure ionization with a nanoelectrospray (SPIN) source.130,131 In the SPIN source, the ESI emitter is moved from atmospheric pressure to the first vacuum stage of the mass spectrometer and is positioned at the entrance of the electrodynamic ion funnel to allow the entire electrospray plume to be collected.130,131 With this new SPIN source, which provides higher MS sensitivity and gentler ionization conditions at the MS interface than traditional ESI sources, glycan coverage can be increased by 25% relative to conventional ionization techniques, and heavily sialylated and polysialylated glycans were observed from human serum samples for the first time.130,131
With the tremendous structural complexity of N-glycans comes the need for robust fragmentation techniques that can provide rich structural information through MS. One such fragmentation method is electronic excitation dissociation (EED), which takes place at an electron energy of >9 eV.132 Optimized EED conditions were recently shown to provide much richer structural information (Figure 4) as compared to collision-induced dissociation (CID)-based methods for permethylated and reducing end-labeled glycans.132 EED is highly amenable to HPLC-based methods that allow for high-throughput glycomics as well as allowing for a great variety of metal charge carriers, since EED is a charge-remote process.
Although collision-induced dissociation (CID) remains one of the most popular forms of ion activation, the technique has some shortcomings when applied to glycoproteomics analyses. Specifically, glycosidic linkages are considerably weaker than peptide bonds, resulting in predominantly glycosidic fragments, with little information about the amino acid sequence.133 Electron transfer dissociation (ETD) is able to overcome these problems by being a nonergodic fragmentation mode that can retain the weaker glycosidic linkage to keep the glycan portion intact while fragmenting the peptide backbone.134 The negative-mode analogue of ETD has been applied in the field of proteomics but has yet to be applied to N-glycans or N-glycopeptides.135 One of the drawbacks of ETD is that precursor ions must have ≥2+ charge (or 2− charge for negative ETD) for species to be observable.136 ETD, however, has been proven to be more efficient as compared to CID.136 Site-specific glycosylation site occupancy has been observed in etanercept, a highly glycosylated protein, that contains multiple N- and O-glycosylation sites.134 It has also been demonstrated that sialic acid-containing N-glycans can provide neutral fragment losses from their charge-reduced species via ETD.137
Fortunately, recent developments have also made CID more attractive for glycoproteomics. The concept of energy-resolved CID, where collision energies are varied so that information on both the peptide and glycan portions becomes available, has been one of these recent advancements.138–140 A tryptic N-glycosylated peptide was fragmented in a single MS/MS experiment, which allowed for the simultaneous acquisition of spectra at lower and higher collision energies (termed collision energy stepping CID).138 In a similar approach, energy-resolved CID studies were performed on various protonated N-glycopeptides that differed in their amino acid composition and charge states.139,140 These experiments demonstrate that the composition, charge state, and proton mobility of the precursor influence the absolute collision energies needed for dissociation.139,140 In place of CID, higher energy collision dissociation (HCD), also a commonplace fragmentation method on commercial mass spectrometers, has been used in an energy-resolved fashion for the identification of core fucosylated N-glycoproteins141 and intact N-glycopeptides from fetuin, α-1-acid glycoprotein, and ribonuclease B.142 All of these studies have shown that CID and/or HCD remain promising tools in the fragmentation toolbox to assess both the glycan and peptide portions of an N-glycopeptide through selective fragmentation of each moiety.138–142
Despite the many new fragmentation techniques being developed, interpretation of glycopeptide MS data remains challenging. High false discovery rates (FDR) make unequivocal identification of glycopeptides difficult. Fundamentally, the FDR can be addressed with ever more advanced software algorithms or with more fragmentation steps to generate solid data on both the peptide and the glycan sequence. A recent paper that includes both strategies was reported by Wu and co-workers. In this work, complex data-dependent decision trees of sequential fragmentation steps of glycopeptides were employed in an HCD-product-dependent ETD/CID workflow on a trihybrid Orbitrap.143 The false discovery rate decreased as more sequence data became available, and data interpretation was facilitated by the Sweet-Heart software, which utilizes machine-learning algorithms to predict N-glycopeptides from HCD fragments. As the sheer amount and complexity of the data increase, so do the demands on the whole informatics pipeline,3 as will be discussed in more detail below.
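For readers unfamiliar with how an FDR is estimated in practice, the sketch below shows the widely used target-decoy calculation, in which matches to a decoy database approximate the number of false matches among the targets. This is a general illustration and is not claimed to be the procedure used in the work cited above; the scores and the function name are hypothetical.

```python
def estimate_fdr(matches, threshold):
    """Target-decoy FDR estimate at a given score threshold.

    matches: iterable of (score, is_decoy) pairs, one per
             glycopeptide-spectrum match.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


# Hypothetical scored matches (higher score = better match)
matches = [(0.9, False), (0.8, False), (0.7, True),
           (0.6, False), (0.5, False), (0.4, True)]
for cutoff in (0.75, 0.5):
    print(cutoff, estimate_fdr(matches, cutoff))
```

Raising the score cutoff lowers the estimated FDR at the cost of fewer identifications, which is why additional fragmentation evidence that improves scores of true matches is valuable.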
Another appealing fragmentation type is ultraviolet photo-dissociation (UVPD) at shorter wavelengths such as 157 and 193 nm in a linear ion trap.144–146 While CID remains the most popular choice for tandem MS experiments in the field of glycoproteomics, and even with new advances such as the concept of energy-resolved fragmentation, it still is not ideal for assignment of a glycosylation site.144 Moreover, UVPD provides considerably more abundant fragmentation patterns as compared to CID, which should aid database searching algorithms.144 UVPD fragmentation has the added benefit of operating on a time scale compatible with chromatographic elution, thereby allowing coupling of this technique with LC separations.145 UVPD has also shown promise for the resolution of glycan isomers.146 UVPD has been applied to O-glycopeptides derived from fetuin, and it was observed that a greater number of diagnostic fragments were produced in comparison to CID-based methods.144
Even with all these above-mentioned advances in MS, both pertaining to novel ionization methods as well as fragmentation modes, CID in both the positive and negative ion modes remains most common. Multidimensional MS-based methods that combine MALDI and/or ESI with tandem MS, sometimes referred to as sequential disassembly, will likely remain attractive to the analytical chemist for providing rich structural information on linkage and branching position of N-glycans.14,147–150 Although MS is capable of identifying the presence of isomers, if not necessarily discriminating between them, the increased availability of structurally well-defined authentic standards/synthetic glycoconjugates is crucial for further spectral matching efforts to an unknown glycan.18 Such spectral libraries are starting to be developed by fragmenting sodiated oligosaccharides in the positive ion mode through sequential MS experiments to yield disaccharide fragment ions that are diagnostic of the glycosidic linkage present in a larger molecule.151
It has been observed that the negative ion mode provides many more diagnostic ions for N-glycans as compared to the positive ion mode.152 This can largely be attributed to the fact that the deprotonated ions fragment in a much more specific manner, usually through cross-ring fragments, as compared to the positive-ion mode.152 Even more interesting, as it relates to the negative ion mode CID fragmentation of N-glycans, is the result that glycans released via endoH and endoS provide the same diagnostic fragment ions and thus structural information as those intact glycans released by the much more commonly used PNGase F discussed earlier.152 In a separate study that utilized the negative ion mode for glycoproteomics and low collision energies, the peptide sequence had a tremendous impact on the fragment ions that were observed.153 Specifically, glycan fragment ions were detected when short peptide sequences were present, and interestingly, these fragment ions were only seen in the negative ion mode from a deprotonated precursor ion and not in the positive ion mode, while secondary cleavage ions were seen when long peptide sequences were present.153
As mentioned briefly above, multiple reaction monitoring (MRM) is a sensitive and selective tandem MS-based quantitative technique that has been used successfully for many years in a range of applications, including absolute quantification of biological compounds in complex mixtures. This technique is usually performed on a triple-quadrupole instrument, where the analyte of interest is compared against the MS responses of known standards. In contrast to standard LC–MS, which scans all ions within a certain scan range, the instrument is programmed in MRM to specifically look for a select number of predetermined MS and MS2 ions (transitions). The first mass analyzer (Q1) is set to only transmit ions of interest, the second mass analyzer (Q2) fragments the ions, and the third mass analyzer (Q3) is set to transmit diagnostic MS fragments only. MRM does not require, but is compatible with, isotope-labeled standards. The use of label-free MRM on free human milk oligosaccharides, monoclonal antibodies, and the glycosylation of the top 8 human glycoproteins in human serum was recently demonstrated.154–156 In a similar approach, sialylated linkage isomers were separated on HILIC columns and linkage-specific transitions were monitored via liquid chromatography-selected reaction monitoring (LC-SRM).157 The sensitivity of MRM was also clearly demonstrated in a quantitative study of human permethylated N-glycans.158 The quantification was reliable down to as little as one-hundredth of a microliter of human serum.
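The Q1/Q3 filtering logic and the comparison against known standards can be sketched in a few lines of Python. This is a minimal illustration only: the precursor m/z, the mass tolerance, and the calibration points are hypothetical placeholders, and the event tuples stand in for real instrument data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    name: str
    q1_mz: float  # precursor m/z transmitted by Q1
    q3_mz: float  # diagnostic fragment m/z transmitted by Q3


def passes(transition, precursor_mz, fragment_mz, tol=0.5):
    """True if an ion pair passes both the Q1 and the Q3 mass filter."""
    return (abs(precursor_mz - transition.q1_mz) <= tol
            and abs(fragment_mz - transition.q3_mz) <= tol)


def mrm_signal(transition, events, tol=0.5):
    """Sum the intensity of (precursor_mz, fragment_mz, intensity) events
    that match the transition; everything else is discarded."""
    return sum(i for p, f, i in events if passes(transition, p, f, tol))


def fit_line(amounts, responses):
    """Least-squares calibration line (slope, intercept) from known standards."""
    n = len(amounts)
    mx, my = sum(amounts) / n, sum(responses) / n
    sxx = sum((x - mx) ** 2 for x in amounts)
    sxy = sum((x - mx) * (y - my) for x, y in zip(amounts, responses))
    slope = sxy / sxx
    return slope, my - slope * mx


def quantify(signal, slope, intercept):
    """Convert a measured MRM signal into an amount via the calibration line."""
    return (signal - intercept) / slope
```

Because only ion pairs that satisfy both filters contribute to the signal, co-eluting species with a different precursor or fragment mass do not interfere, which is the source of the selectivity described above.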
Ion mobility-mass spectrometry (IM-MS) is an emerging analytical technique for isomer discrimination on the order of milliseconds. IM-MS offers information on the 3-D shape of a molecule from its gas-phase rotationally averaged collision cross-section (CCS values) with a buffer/drift gas. In the field of glycomics and glycoproteomics, where definitive structural information is necessary, IM-MS shows promise and several advantages over traditional MS-based techniques for the discrimination of linkage and position isomers, identification of glycosylation sites, and information on potential conformational changes that are induced from protein-glycan interactions.159 Several ion mobility spectrometry instrument types exist currently, including traveling-wave ion mobility spectrometry (TW-IMS), drift tube ion mobility spectrometry (DT-IMS), high-field asymmetric ion mobility spectrometry (FA-IMS), and trapped-ion mobility spectrometry (T-IMS).159
Through the use of TW-IM-MS, singly charged protonated epimeric glycopeptides, an alpha N-acetyl D-glucosamine (α-D-GlcNAc) versus alpha N-acetyl D-galactosamine (α-D-GalNAc), both linked to a threonine residue, were discriminated based on their collision cross sections.160 Additionally, collision-induced dissociation (CID) prior to ion mobility separation was shown to be useful for discriminating singly charged epimeric oxonium ions from D-GalNAc and D-GlcNAc glycoforms.160
Sialic acid linkage isomers have been discriminated from one another by the use of TW-IM-MS, where the dehydrated, singly protonated ions of the α2,3- and α2,6-linked sialic acid isomers showed diagnostic CCS values.161 In this study, it was also demonstrated that, for fragment ions produced via CID from a tryptic glycopeptide digest, the ratios of α2,3 to α2,6 sialylation in their arrival time distributions (ATD) can provide additional information on the presence, or absence, of antennary sialo-glycoforms.161 With recent advancements in automated carbohydrate synthesis, authentic standards should soon become more readily available and thereby allow in the coming years for the definitive identification of unknown multiantennary sialic acid isomers based on the matching of cross section and m/z values provided from such experiments.
Isomeric glycopeptides that share the same sequence but differ in the positioning of their glycan attachment can be discriminated both individually, and in a mixture, from their ATD profiles.162 However, in certain instances, IM-MS cannot differentiate intact glycopeptide ions alone, but after CID, trisaccharide fragment ions can prove diagnostic in their ATD and CCS values in the discrimination of α2,3 and α2,6 sialic acid linkage isomers, as shown in Figure 5. Such ion mobility spectrometry techniques remain quite attractive in the field of glycomics because of their ability to resolve certain isomeric glycoforms both before and after collision-induced dissociation, which allows for multiple forms of glycan discrimination in one experiment.
As a means to increase the separation ability of gas-phase carbohydrate isomers, metal salt adducts, in both the positive and negative ion modes (positive ion adducts, proton, sodium, lithium, potassium, barium, calcium, beryllium, magnesium; negative ion adducts, deprotonated, phosphate, chloride), either through a first stage of ion mobility separation or after collision-induced dissociation, have been applied for the discrimination of both high-mannose N-glycans163,164 and underivatized oligosaccharides165–167 with IM-MS. Whereas an individual glycan or simpler carbohydrate may not be discriminated from other similar isomers based on their cross sections alone, when bound to a charged adduct their fragment ions may indeed be diagnostic. This added layer of gas-phase separation provided by ion mobility spectrometry-based methods is a powerful advancement over traditional MS approaches where rigorous fragmentation must be performed in the hope of delineating isomeric glycans. More importantly, IM-MS holds the power to resolve mixtures of glycans based on differences in their cross sections as seen by multiple drift time features, whereas conventional MS will provide an average of ion intensities at a given m/z for however many glycan structures exist in the mixture.
Despite the many advantages of coupling IMS to an analytical workflow, one important drawback that is associated with IM-MS as a complete isomeric discriminatory analytical technique is that a single analyte ion may produce multiple drift time peaks/ATD features based on the presence of multiple-ion conformations (e.g., a cation that may adduct at multiple hydroxyl groups on a single molecule).163–167 This presents the need for complementary or hyphenated analytical techniques to definitively identify a single N-glycan isomer. Additional experiments have shown that the choice of metal adduct does indeed influence the fragmentation pattern of high-mannose N-glycans,168 and such a multidimensional approach may prove useful to discriminate difficult mixtures of glycoforms.
With the improved separation ability of ion mobility spectrometry, as compared to traditional MS approaches, predictive information on the identity of an unknown glycan may be obtained from an experiment that compares its collision cross section (Å²) versus its mass-to-charge ratio (m/z) with those of other similar glycan structures.169 For example, more complex branched glycan structures should exhibit a larger collision cross section as compared to more linearly shaped ones. With a home-built drift tube ion-mobility instrument, 117 glycopeptides that contain 27 glycan structures for the chicken ovomucoid were observed.170 With IM-MS alone, glycosylation at a specific site on a protein can be delineated, which provides an elegant complement to existing MS-based assays.170
The choice of buffer/drift gas used in IM-MS experiments of complex carbohydrates plays an important role in the calibration of collision cross section values.171 When nitrogen is used, smaller errors are observed as compared to when helium is selected as the buffer/drift gas for the study of negatively charged N-glycans.171 With a TW-IMS instrument, the choice of calibrant for calculating CCS values plays a tremendous role. Ideally, the calibrant should be structurally similar to the analyte of interest. For the study of N-glycans, dextran ladders have shown potential as a quite suitable calibrant in terms of retention time in liquid chromatography as well as m/z and CCS values in ion mobility spectrometry.171 These results clearly demonstrate that careful consideration must be given to the selection of experimental parameters to maximize the possibility of gas-phase ion discrimination of N-glycan isomers, and these experiments will also become more reliable as a wider range of authentic standards becomes available.
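To make the role of the calibrant concrete, the sketch below fits a power-law relationship between calibrant drift time and known CCS and then applies it to an analyte drift time. It is a deliberately simplified assumption of how such a calibration may be set up: practical TW-IMS workflows typically also correct the drift times and normalize the CCS for charge and reduced mass, steps that are omitted here, and any numbers passed in would have to come from real calibrant measurements.

```python
import math


def fit_power_law(drift_times, ccs_values):
    """Least-squares fit of CCS = a * t**b in log-log space; returns (a, b)."""
    xs = [math.log(t) for t in drift_times]
    ys = [math.log(c) for c in ccs_values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = math.exp(my - b * mx)
    return a, b


def predict_ccs(drift_time, a, b):
    """Apply the fitted calibration to an analyte drift time."""
    return a * drift_time ** b
```

Since the analyte CCS is only interpolated from the calibrant curve, a calibrant that behaves differently from the analyte in the gas phase propagates directly into the reported value, which is why structurally similar calibrants such as dextran ladders are preferred for glycans.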
Ion mobility spectrometry has also shown promise in the area of de novo carbohydrate sequencing, the structural identification of carbohydrates from unknown origins without knowledge of biosynthetic pathways to inform the potential monosaccharide subunits and glycosidic linkages present in an analyte. As was discussed above, IM-MS has been used to discriminate oligosaccharide isomers either from their metal adducted collision cross sections or from their fragment ion cross sections following collision-induced dissociation.165–167 Nonetheless, certain limitations still remain as to whether or not fragment ions from oligosaccharide isomers can be different enough in their shape/mobility features to be diagnostic. One improvement to overcome this challenge is the concept of energy-resolved ion mobility-mass spectrometry, where a mixture of isomeric oligosaccharides will fragment uniquely when activated at different energies via collision-induced dissociation.172 Four isomeric trisaccharides were investigated and it was observed they each had distinct alkali-metal adduct dependent dissociation energies.172 For this energy-resolved IM-MS technique to be employed, two factors must hold true for the analytes of interest: the gas-phase analytes cannot change their conformation during collision-induced dissociation and they must exhibit unique gas-phase stabilities in order to be energy-resolved.172
Ion mobility-mass spectrometry has also shown promise in the chiral discrimination of a complete set of individual monosaccharide isomers without the need for collision-induced dissociation or derivatization.173 D/L enantiomers can be resolved by the creation of chiral noncovalent gas-phase complexes containing various divalent metal cations and L-amino acids, that will differ in their respective collision cross sections based on the unique complex formation that is diagnostic for a given monosaccharide analyte.173 One of the major challenges of this form of chiral discrimination with IM-MS is the empirical determination of what chiral complexes are sufficient to create large enough differences in gas-phase interactions to be diagnostic. If acidic hydrolysis methods can be improved to allow for stoichiometric recovery of monosaccharides, this method173 can allow for the de novo identification of the monosaccharide constituents from a larger, more complex biological sample.
LC continues to play a very important role in glycomic and glycoproteomic research with its different retention modes, column dimensions, and formats and its use as either an analytical profiling technique or a micropreparatory tool.186,187 As discussed already in a previous section, derivatization of glycans with suitable fluorogenic reagents is today at the heart of numerous routine applications, although the uses of anion-exchange chromatography with pulsed amperometric detection still appear in the literature for native N-glycans.174,175 The use of fluorescence derivatization has become routine in conjunction with conventional HPLC columns and increasingly with the UHPLC (ultrahigh performance liquid chromatography) columns packed with particles smaller than 2 μm in diameter. The fluorescence derivatization thus safely pushed analytical glycoscience to the low picomole range while using conventional detectors. These applications chiefly use 2-aminobenzamide, 2-aminopyridine, and 1-phenyl-3-methyl-5-pyrazolone.90 However, with rationally designed better labeling reagents, detection limits can be significantly pushed down further.61,90 While the earlier application of derivatization/HPLC utilized the tag hydrophobicities in the reversed-phase mode, the current availability of HILIC phase systems makes the reversed-phase mode less popular, as the interactions with the polar-phase structural moieties can be more effectively utilized in the resolution of underivatized monosaccharides,176 native or derivatized glycans, or even isomeric glycopeptides that contain the same peptide backbone.177 An illustrative example of the state-of-the-art UHPLC commercial technology used in a HILIC separation of fluorescently labeled IgG glycans, isolated from human plasma, is shown in Figure 6.
In coupling chromatographic separations to MS, capillary columns featuring drastically reduced volumetric flow rates must be applied. While the column inner diameters are here typically less than 100 μm, the capillaries of different lengths can be filled with particles that were surface-modified for the use of reversed-phase chromatography (e.g., for separations of glycopeptides or permethylated glycans), porous graphitized carbon (glycopeptides and either derivatized or native glycans), or HILIC-type capillary LC (increasingly used for various glycoconjugates with or without tags). The alternatives to capillaries are microfluidic chips45,179,180 where the separatory channels can be filled with selective sorption matrixes. In addition, other sample treatment options, such as immobilizations, derivatization, or enzymatic reactors, can be incorporated on the same microchip as the LC separation channel. Besides chromatography, tagging procedures could also be beneficial for the sake of improved MS detection or fragmentation. “MS-friendly” mobile phases must be applied for any mode of LC with MS, so that compromise solutions are often seen in practice. Two-dimensional LC, where an analyte passes through two different columns with different types of stationary phase, is an interesting concept for highly complex samples. This was recently exemplified with sialylated N-glycans as their 2-aminobenzoic acid-derivatized forms, which were passed first through a hydrophilic interaction anion-exchange column, then into a graphitized carbon-filled column and, finally, to a mass spectrometer.181 Naturally, mobile-phase compatibility is a significant issue in any 2D separation system. Yet another study182 used a sequential arrangement of LC columns in LC–MS of O- and N-linked glycopeptides.
A HILIC-based system was used in the separation of isomeric glycopeptides containing the same peptide type.157 One of the challenging tasks, as already mentioned earlier, has been to resolve sialic acid linkage isomers of some importance to cancer biology. With selected reaction monitoring (SRM) via MS after an LC separation, α2,3- and α2,6-linked isomers can be resolved through the use of specialty packing/derivatization. The advantage of SRM is a sensitivity enhancement, but reproducible retention times must be ensured across different runs and MS conditions.157
The enormous complexity of glycobiologically interesting mixtures challenges the sorption selectivity and column efficiency of all current separation systems and their best combinations with MS techniques. The current HILIC and PGC packings represent great examples of separation selectivity, but this is still without the high column efficiencies demonstrated in the cutting-edge ultrahigh-pressure operation shown with other biological mixtures.183,184 To our knowledge, similar systems have not yet been demonstrated in the field of glycoscience. An interesting variation on how to conduct complex analyses of both the peptides and glycans in a single setup (using LC–IM-MS) was proposed by Lareau et al.,185 but this proposition still needs to be tested across a range of biological samples.
A high-throughput glycomic platform with a commercial UHPLC instrument coupled to both a fluorescence detector and MS at its heart71 exemplifies the efforts needed to screen numerous samples and compare them quantitatively. The high throughput was enabled through the incorporation of a new derivatization route using 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate reacting with glycosylamine-reactive groups.
Micropreparative HPLC can potentially be useful to isolate glycoconjugates from complex biological materials to be available as analytical standards or be used for biological screening purposes. Even conventional analytical HPLC columns with inner diameters of 4.6 mm can be used to provide pure glycans or glycopeptides at microgram to milligram amounts. The structurally close mixture components can often be resolved through the use of alternate-pump recycling HPLC186 and recovered in the pure state. This system was recently utilized in developing a protocol for the purification of synthesized carbohydrates.187
Capillary electrophoresis with laser-induced fluorescence (CE-LIF) detection for carbohydrates is nearly as old as the first use of MALDI and ESI techniques in biomolecular analysis. Since then, CE-LIF has enjoyed a steadily rising acceptance in academic and industrial laboratories together with some modifications with regard to the separation conditions (coated vs uncoated capillary walls, free medium vs polymer gel additives, different types of buffers, etc.), the fluorescent tags explored, and the separation formats (e.g., capillaries vs microfabricated channels). In recent years, there appears to be a significant upswing of interest in CE-based determinations, which is in no small part due to progress in overcoming the difficulties in detection. CE in its most widely practiced mode, capillary zone electrophoresis, provides unprecedented separation efficiencies and resolution of charged biomolecules but at the expense of relatively small samples (nanoliter volumes) to be introduced at the capillary inlet, with the corresponding needs for ultrasensitive detection. These sensitivity requirements have been met relatively easily by LIF detection using appropriate fluorescent tags; only recently has MS detection shown promise of meeting them as well.
With regard to the spectroscopic tags (reviewed previously10,14), there have been efforts to replace or supplement procedures using 8-aminopyrene-1,3,6-trisulfonic acid (APTS), the reagent introduced by Guttman and co-workers93 that has now been established firmly in numerous laboratories. These efforts will likely continue into the future, as seen with the very recent example188 of a carbonyl-reactive tandem mass tagging approach for the benefit of multiplex CE–MS. Further developments in CE–MS toward maximizing both CE’s separation potential and MS sensitivity could have substantial impact in glycoanalysis.
With regard to the routine uses of CE-LIF with APTS labeling, numerous studies attest to its capability of resolving isomeric glycans and the high reproducibility of its contemporary versions. This is particularly evident in various analyses of biopharmaceuticals, where robust glycan profiling has become essential. Recently, 20 independent laboratories representing biopharmaceutical companies, contract laboratories, academic institutions, and regulatory agencies participated in a comparative analytical trial on N-glycan analysis by CE-LIF with highly satisfactory results; a close comparison with a competitive method (UHPLC) is reportedly in progress, but the results were not yet available.189 Additional aspects of biopharmaceutically important applications of CE-LIF profiling and quantification, such as sample preparation, were also addressed.190 While these methodologies usually concern the glycan sample components with known or predictable structure, there is still a need for authentic glycan standards and N-glycan databases in this area.191
An apparent advantage of CE-LIF over most other approaches is its multiplexing capability. This is evident in both biopharmaceutical analyses and in the applications of biomedical and clinical interest. It is perhaps most succinctly demonstrated by two papers of Ruhaak et al.26,192 profiling a large number of individuals (donors of plasma samples) for various physiological parameters.
For many years, an obvious disadvantage of CE-LIF techniques has been their limited scope in identifying glycans in complex mixtures such as physiological fluids and tissue extracts in disease biomarker studies. The difficulties are further underscored by the limited availability of authentic glycan standards that would verify, or even suggest, the presence or absence of any glycan component through comigration in the high-resolving CE systems. In their minireview/perspective, Mittermayr and co-workers193 describe some alternative routes leading to glycan structural elucidation from CE-based measurements: sequential glycosidase-aided digestions using comparative CE runs; use of lectin affinity; and the use of orthogonal separation principles based on chromatography, which all involve complex procedures short of the availability of direct CE/tandem MS. However tedious it may seem, identification of N-glycans in a sample-limited complex mixture, such as human blood serum, could still be accomplished through a series of comparative runs of the MS-characterized N-glycans originating from “standard glycoproteins” (fetuin, haptoglobin, α-1 acid glycoprotein, IgG, and ribonuclease B).89 Using this strategy, 37 unique structures were assigned to 52 peaks recorded in high-resolution runs.
One of the advantages of the nearly universally employed APTS fluorescent label is its predictable labeling at the sugars’ reducing end and its triply negative charge leading to the orderly electromigration due to the glycans’ hydrodynamic radii. However, under the otherwise analytically favorable buffer composition, sialylated glycans present some difficulties due to their fast migration. This situation has led many investigators to enzymatically remove sialic acid residues to fit the CE electromigration windows. Unfortunately, this comes at the expense of the biologically desirable information on the presence of sialylation and its linkage isomerism. As shown in recent publications,89,194 this problem can be significantly reduced by methylamidation of the sialic acid residues, with the added advantage of resolving many α2,3- and α2,6-substituted glycans during the microchip CE runs. In fact, after the double derivatization (with NH2CH3 and APTS) the glycans migrate predictably according to their molecular mass, while the separation range has been significantly enhanced (see Figure 7).
Further miniaturization of the CE-based separations to the microchip format has been an evident trend in the field of bioanalytical chemistry for a number of years. From the first application of microchip electrophoresis to glycan analysis in 2007,195 the microchip design has now undergone major modifications toward a serpentine channel with asymmetrically tapered turns and a separation length of 22 cm, fabricated in glass substrates.196 With the separation window between 60 and 130 s, the system features efficiencies of unprecedented 800 000 theoretical plates. Additionally, the microfabricated systems of this type are known to exhibit very high reproducibility in terms of sample introduction and migration times. The general attributes of conventional CE usually appear to translate well into the microchip format.
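To put the quoted efficiency in perspective, the standard half-height expression for plate number, N = 5.54 (t / w½)², can be rearranged to give the peak width implied by a given N. The short Python sketch below does this under two assumptions that are not stated in the text: peaks are Gaussian, and the 800 000-plate figure applies to a peak migrating within the 60 to 130 s window.

```python
import math


def plate_count(migration_time_s, fwhm_s):
    """Theoretical plates from migration time and peak width at half height."""
    return 5.54 * (migration_time_s / fwhm_s) ** 2


def fwhm_for_plates(migration_time_s, plates):
    """Peak width at half height (s) implied by a given plate count."""
    return migration_time_s * math.sqrt(5.54 / plates)


# Example: width implied for a peak at the end of the separation window.
late_peak_width = fwhm_for_plates(130.0, 800_000)  # roughly 0.34 s
```

Under these assumptions, a peak near 130 s would be only about a third of a second wide at half height, which illustrates why fast detection and reproducible sample introduction matter in the microchip format.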
Effectively combining CE, a separation method with high resolving power and the proven capability to resolve isomers, with the highly informative tandem MS techniques has been a target of numerous investigations since the early 1990s, as reviewed previously.12,193,197 With the inherent limitation of small sample loads in CE and its microfabricated versions, various compromises must often be made in terms of either reduced separation capabilities or detection sensitivity. Most preconcentration remedies, such as stacking or solute trapping, have been only marginally successful. The choice of “MS-friendly” buffers (further limited in terms of buffer additives to improve CE resolution) can also compromise separations. These limitations notwithstanding, some promising results have been achieved recently, as briefly reviewed below. The gradually improving MS detection capabilities of the most modern instruments further add to the future promise of CE/tandem MS.
In structural investigations of glycopeptide mixtures and complex glycan pools, different tagging strategies can often be beneficial to the overall aims of CE-MS. A uniform tag, such as the derivatization product of APTS labeling, is clearly beneficial to efficient and predictable electromigration during CE: without it, the neutral glycans in a mixture cannot be effectively analyzed. On the other hand, the APTS-derivatized glycans then must be detected, through ESI, in the negative-ion mode, which is inherently less sensitive. Nevertheless, a carefully optimized CE (using even a polymer additive) was successfully coupled to a time-of-flight mass spectrometer, via ESI, with a respectable detection sensitivity for the APTS-labeled biantennary glycans of biotechnological interest.198,199
A further quest for suitable cationic labels in CE-MS continues to this date. A promising example of these activities is the use of a recently developed aminoxy-TMT labeling approach in CE-MS,188 demonstrating MS2 recordings with multiantennary N-glycans. Additionally, the multiplexing capability of this approach is also a bonus.
A coupling between CE and MALDI-MS involves an off-line fraction deposition on a moving surface target (e.g., MALDI plates), as reviewed previously, and apparently still used in some laboratories. Combining CE with MS through ESI has undergone different modifications more recently; some of these could be crucial to the future of CE-MS as a more widely accepted tool in analytical glycoscience. Two basic modes of the CE-ESI-MS coupling have traditionally been (a) coaxial sheath-flow attachment isolating electrically the CE and MS parts and (b) sheathless flow arrangement using special modifications of a capillary tip. While the sheath-flow arrangement is operationally robust, its drawback is a solute dilution prior to MS-detection. The sheathless approaches appear technically complicated in terms of a capillary tip fabrication and operational conditions. During the recent period, newer variants of the CE-ESI-MS coupling were described: a flow-through microvial interface;200 a new type of sheathless attachment, which is compatible with nano-ESI;201 and a novel electrokinetically pumped sheath-liquid ESI interface.202–204 The latter interface is featured and described in Figure 8. This arrangement was effectively utilized in CE-MS analysis of the negatively charged heparin oligosaccharides.
Many types of databases and software already exist today or are under development to model carbohydrate structure, interactions, identification, annotation, and analysis (see Table 1 for examples). Several recent reviews and protocols in the field offer more comprehensive coverage of glycoproteomic databases, software for released glycan and glycopeptide annotation, biomarker detection, and pathway analysis.150,205–209
Databases may be used to identify glycans from analytical data; once a tentative glycan structure is identified, it is possible to query other databases for metadata associated with the glycan, such as binding partners, lipids, proteins, glycosyl-transferases, diseases, possible 3D structures, etc.208 Repository databases that store large LC–MS data sets in order to freely allow the raw data to be explored by other scientists have been established in the proteomics and glycoproteomics field (ProteomeXchange/PRIDE) but less so in the glycomics field. Difficulties have arisen, though, in cross-referencing proteomics and glycomics databases, since during both PNGase F release of N-glycans and reductive β-elimination of O-glycans, the protein information is lost. In addition, the databases are often written in such a way that they cannot encompass both glycan and protein information.
Some glycomic information can be accessed in the large proteomic databases such as UniProt, Protein Data Bank, and Protein Atlas (Table 1). The bioinformatics resource portal ExPASy also has a section devoted to glycan databases and software tools (Table 1). With the number of glycan databases and specialized resources growing, practical protocols of glycoinformatics may be good starting points for the novice in the field.207,208 Lisacek and co-workers provide a rather comprehensive list of different types of glycan databases and tools, some of which include the following functions: (a) determine monosaccharide compositions from MS data (GlycoMod), (b) predict MS/MS spectra from glycan structures (GlycoWorkbench), (c) determine glycan structures from MS and MS/MS data (UniCarb-DB), (d) determine N-glycan structures from UHPLC/CE data (GlycoBase), and (e) find publications of glycan structures and associations with proteins and other associated information (UniCarbKB).208
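The idea behind function (a), deriving candidate monosaccharide compositions from a measured mass, can be illustrated with a brute-force search. The sketch below is not GlycoMod's algorithm; it is a minimal assumption-laden example that considers only four residue types, treats the glycan as an underivatized neutral molecule with a free reducing end, and uses a hypothetical mass tolerance and residue-count limit.

```python
from itertools import product

# Monoisotopic residue masses in Da (monosaccharide minus water).
RESIDUE_MASS = {
    "Hex": 162.0528,
    "HexNAc": 203.0794,
    "dHex": 146.0579,
    "NeuAc": 291.0954,
}
WATER = 18.0106  # added once for the free reducing-end glycan


def glycan_mass(composition):
    """Neutral monoisotopic mass of a glycan given {residue: count}."""
    return WATER + sum(RESIDUE_MASS[r] * n for r, n in composition.items())


def find_compositions(neutral_mass, tol_da=0.02, max_count=10):
    """Enumerate compositions whose mass lies within tol_da of neutral_mass."""
    names = list(RESIDUE_MASS)
    hits = []
    for counts in product(range(max_count + 1), repeat=len(names)):
        candidate = dict(zip(names, counts))
        if abs(glycan_mass(candidate) - neutral_mass) <= tol_da:
            # Report only the residues that are actually present.
            hits.append({r: n for r, n in candidate.items() if n})
    return hits
```

For example, `find_compositions(1234.4334)` includes `{"Hex": 5, "HexNAc": 2}`, the composition of a Man5 high-mannose N-glycan. A composition alone says nothing about linkage or branching, which is why the structure-level resources listed in (b) through (e) are needed as well.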
GlycoWorkbench is one of the most straightforward and appreciated tools in MS-based glycomics due to its graphic interface211 (Table 1). To annotate an MS or MS/MS scan, the software needs a peak list (imported or generated in the software) and a pool of potential glycan candidate structures. These structures are drawn by the user or provided by the included databases of glycan structures or a user-generated database. Recently, GRITS toolbox, freely available software for glycomics data processing and archiving, was released (http://www.grits-toolbox.org/). Some of the features include automatic annotation of MALDI or LC–MS/MS glycomic data and the possibility to compare several MS runs. The automatic annotation of MS spectra uses a set of glycan databases, which have been curated by experts (1693 structures in total, 1191 N-glycans, 218 O-glycans, 286 GSLs) or are user defined.
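The general idea of annotating a peak list against a pool of candidates can be sketched as follows. This is a minimal illustration and not the matching procedure of GlycoWorkbench or GRITS toolbox: the function names, the candidate labels, the example peak list, and the 0.02 Da tolerance are all hypothetical, and the sketch assumes underivatized glycans detected as singly charged sodium adducts.

```python
# Monoisotopic residue masses (Da) of underivatized monosaccharides.
RESIDUE_MASS = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "dHex": 146.05791,
    "NeuAc": 291.09542,
}
WATER = 18.010565
SODIUM_ION = 22.989218  # mass of Na+ (sodium atom minus one electron)


def sodiated_mz(composition):
    """Theoretical m/z of the singly charged [M+Na]+ ion for a composition."""
    neutral = WATER + sum(RESIDUE_MASS[r] * n for r, n in composition.items())
    return neutral + SODIUM_ION


def annotate(peaks, candidates, tol_da=0.02):
    """Map each observed m/z to the candidate names matching within tol_da."""
    theoretical = {name: sodiated_mz(comp) for name, comp in candidates.items()}
    return {
        mz: [name for name, t in theoretical.items() if abs(t - mz) <= tol_da]
        for mz, _intensity in peaks
    }


if __name__ == "__main__":
    # Hypothetical candidate pool, given as monosaccharide compositions.
    pool = {
        "Hex5HexNAc2": {"Hex": 5, "HexNAc": 2},
        "Hex3HexNAc4dHex1": {"Hex": 3, "HexNAc": 4, "dHex": 1},
        "Hex4HexNAc4dHex1": {"Hex": 4, "HexNAc": 4, "dHex": 1},
    }
    # Hypothetical peak list of (m/z, intensity) pairs.
    peak_list = [(1257.423, 1200.0), (1485.534, 5400.0), (1500.000, 300.0)]
    for mz, names in annotate(peak_list, pool).items():
        print(mz, names or "unassigned")
```

Matching on precursor mass alone returns every isomer sharing a composition, which is why curated structure databases and MS/MS fragment matching are needed to narrow the assignment.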
Both commercial and open-source developers have now attempted to automate glycopeptide identification in large data sets, creating software such as Byonic,212 Protein Prospector,137 GPQuest,213 GlycopeptideID,214 SweetHeart,215 and GlycoFragWork.216 The fidelity of the annotation can be improved by analyzing released glycans, formerly glycosylated peptides, and glycopeptides in parallel. The data are then combined to accurately annotate the glycosylation sites with plausible glycans. Even so, micro- and macro-heterogeneities can still be missed since each glycosylation site may carry several different glycans.
One of the most popular software programs for glycopeptide annotation from mass spectrometry data is the proprietary proteomic search engine Byonic (standalone commercial software or bundled with Proteome Discoverer, Thermo Fisher Scientific) that allows semiautomatic annotation of N- and O-glycopeptides. Byonic scores, ranks, and identifies glycopeptides by searching separate protein and glycan databases. Glycans are specified as monosaccharide compositions. Predicted glycans are placed on prospective N-glycan motifs (NXS/T, X ≠ P). The scores take several types of ions into account, including oxonium ions and for N-glycopeptides, the glycosylated peptide fragments (b- and y-ions). Recently, the automatic glycopeptide annotation accuracy of the uncharacterized glycoprotein Basigin by Byonic was compared to that of a human expert.217 The micro and macro heterogeneous N-glycosylation sites were annotated using Byonic with or without a background of complex peptides. The accuracy and coverage were both above 80%. The false discovery rate was below 1%. The N-glycan database of Basigin that was used to increase the accuracy of the Byonic annotation was annotated manually. The authors indicate that automated glycopeptide annotation may still have difficulties with ambiguous monosaccharide residues (near isobaric masses) and peptide modifications with masses that coincide with monosaccharide masses.
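The N-glycan motif mentioned above (NXS/T, X ≠ P) is simple enough to locate with a regular expression. The sketch below is only an illustration of the motif search and says nothing about how Byonic is implemented; the function name `find_sequons` and the example sequence are hypothetical.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after the Asn, so overlapping
# sequons (e.g., "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein_sequence):
    """Return 1-based positions of Asn residues in N-X-S/T (X != P) motifs."""
    return [m.start() + 1 for m in SEQUON.finditer(protein_sequence.upper())]


if __name__ == "__main__":
    # Made-up sequence: Asn-2, Asn-9, and Asn-10 are in sequons,
    # whereas Asn-6 is followed by Pro and is therefore excluded.
    print(find_sequons("MNGSANPTNNTS"))
```

A sequon marks only a potential site; whether it is actually occupied, and by which glycans, still has to be established experimentally.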
Recombinant proteins such as monoclonal antibodies (mAbs), growth factors, hormones, cytokines, enzymes, and vaccine components are examples of important licensed therapeutic proteins that account for an increasing share of the revenues in the pharmaceutical industry. In 2013, global sales of mAbs were nearly $75 billion, representing half of the total sales of all biopharmaceutical products.218 Since the approval of the first commercial therapeutic mAb in 1986, the numbers have increased substantially, amounting to 47 mAbs approved in the USA and Europe as of 2014.218 With the current approval rate, more than 70 mAbs are expected to be on the market by 2020.218 Since therapeutic glycoproteins are much more complex compared to small molecules, the development and approval of generic (or follow-on) biopharmaceuticals, biosimilars, has been greatly delayed. The first biosimilars were not approved in Europe until 2013 and in the USA in 2015.
Therapeutic glycoproteins are inherently difficult to analyze. The most potent and difficult modification of proteins is probably their glycosylation. The glycosylation of therapeutic glycoproteins may affect serum half-life, in vivo activity, safety, efficacy, immune response, and solubility.219 The effector functions of mAbs are also highly dependent on the glycosylation of their Fc portion, which interacts with Fc receptors and other lectins. Tailored glycosylation can therefore contribute to the pro- or anti-inflammatory effects of mAbs.220 The U.S. Food and Drug Administration (FDA), the European Medicines Agency (EMA), and the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) have therefore released guidelines and regulations regarding the documentation of the glycosylation of therapeutic glycoproteins.
The serum half-life of most therapeutic proteins is highly dependent on their sialylation; nonsialylated proteins are cleared from the circulation via lectin receptors in the liver.221 Several commercial variants of the hormone erythropoietin (EPO) exist today, some of which have been genetically modified to harbor more glycosylation sites and thereby have a greatly improved half-life in circulation.222
The glycosylation analysis of therapeutic proteins should preferably be automated, cost-effective, fast and high-throughput while still fulfilling the required levels of accuracy and reproducibility.102 There is no “gold standard”, but so far, the most common method to establish the mAb glycan profile is to analyze reductively aminated N-glycans with HILIC HPLC or UHPLC with fluorescence detection. This general method has high throughput, precision, and accuracy and also allows for quantification. Recently, a number of comparative papers have been published which benchmark both chromatographic and MS-based glycan analysis of mAbs.223–225 Reusch and co-workers compared seven chromatographic methods (CE and HPLC variants with fluorescence detection) and 11 MS methods in the N-glycan analysis of mAbs in two accompanying papers. The chromatographic methods showed similar results for quantitation, accuracy, precision, and separation and were considered suitable for IgG1 glycosylation analysis. The most prominent glycans could be deduced and quantified with high accuracy and precision in all but one MS method. The N-glycan profiles obtained with the MS-based methods were overall very similar compared to the chromatographic methods. The two comprehensive studies together recommend the HILIC (2-AB) HPLC method as a “release method”; it is easier to validate in a regulated good manufacturing practice (GMP) environment and displays excellent robustness, accuracy, and reproducibility.223
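Quantification in fluorescence-based glycan profiling is commonly reported as relative abundance, i.e., each integrated peak area as a percentage of the total. A minimal sketch is given below; the function name, the peak labels, and the peak areas are hypothetical and are not taken from the studies cited above.

```python
def relative_abundance(peak_areas):
    """Convert integrated peak areas into percentages of the total area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {glycan: 100.0 * area / total for glycan, area in peak_areas.items()}


if __name__ == "__main__":
    # Hypothetical integrated fluorescence peak areas (arbitrary units)
    # for three labeled N-glycan peaks from one chromatogram.
    areas = {"G0F": 500.0, "G1F": 400.0, "G2F": 100.0}
    for glycan, percent in relative_abundance(areas).items():
        print(f"{glycan}: {percent:.1f}%")
```

Because each labeled glycan carries one fluorophore at its reducing end, peak areas are generally treated as proportional to molar amounts, which is one reason this readout is favored over MS signal intensities for routine release testing.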
Some therapeutic glycoproteins are much more challenging to analyze than mAbs. One example is Etanercept (TNFα Fc fusion protein), a therapeutic glycoprotein with three major N- and 13 O-linked glycosylation sites. Houel and co-workers analyzed the N-glycans using UHPLC-(HILIC) with 2-AB labeled glycans, released with PNGase F.134 The structures were validated with exoglycosidase digestions. Glycopeptides digested with trypsin and PNGase F were further analyzed for their O-glycan sites. Parallel LC–MS fragmentation with ETD and CID (Waters MSE) allowed the elucidation of 12 O-linked sites. Taken together, an impressive characterization was achieved, although compared to mAb glycan profiling, the approach was much more laborious.
The multiple involvement of glycosylated proteins in cellular functions, cell growth and differentiation, cell-to-cell recognition and cellular adhesion, and metastasis has been recognized for many years. Accordingly, glycosylation changes have been associated with a multitude of human diseases for a very long time.226 From the fundamental errors of glycosylation associated with many newly recognized human glycosylation disorders,227 to autoimmune diseases,228 diabetic complications and metabolic syndrome,210 and many types of malignancies, the literature accounts associating certain glycosylated proteins with a disease and its stages have become so extensive that they now fall outside the scope of this review. Consequently, we only select the references pointing to certain trends leading to the discovery of disease biomarkers, new bioanalytical strategies, and the efforts to profile glycopeptides or glycan pools to meet the diagnostic and prognostic goals of future biomedicine. The general importance of O-glycosylation notwithstanding, the glycomic and glycoproteomic studies concerning N-glycoproteomes still appear more frequently in the area of biomarkers. This is likely due to both the structural uncertainties of O-glycosylation and clear methodological advances realized with N-glycans to date.
The more mature glycomic profiling techniques based on UHPLC, MALDI-MS, or multiplex CE-LIF methodologies can now be fully utilized in large-scale screening efforts across different populations. This has been exemplified by plasma analyses (or those of its preselected proteins such as IgG) of thousands of individuals of different ethnicities and incidences of hypertension,229 correlation between liver glycogenes and certain N-glycan structures,230 glycosylation trends in different diabetic patients,231 and the study of genetic loci connected with glycosylation of IgG232,233 and by association with various autoimmune and inflammatory diseases and/or hematological cancers.
Targeted biomarker discovery has been increasingly pursued on the basis of a selection of certain glycoproteins that have a known or suspected association with a human condition or physiological parameters. In the studies of Ruhaak and co-workers,26 IgG, IgA, and alpha-1-antitrypsin were selected as enriched plasma fractions to study the dynamic changes in N-glycosylation due to gender, age, and pregnancy status. Multiplex CE-LIF was utilized to screen a large number of individuals toward meaningful statistical evaluations.
The investigations linking aberrant glycosylation and cancer have been on a rapid increase during recent years. Various studies traditionally utilize cancer cell lines, tissue specimens, and different physiological fluids, most commonly blood serum or plasma. While the cancer-related glycoproteins entering the blood circulation from a relevant cancer-affected organ may only be a tiny proportion of the serum proteome233 and thus be barely detectable without a serious preconcentration, the currently available glycomic and glycoproteomic technologies can still measure meaningful glycosylation changes due to the immune system responses or alterations in the acute phase proteins; this fact now seems reflected in the current approaches in the area. As examples of targeted glycomic profiling, haptoglobin was affinity-isolated from serum samples prior to the observation of different glycosylation levels of fucosylated tri- and tetra-antennary glycans in liver cirrhosis and hepatocellular cancer;23 although different profiling techniques were utilized, a similar general approach was used by other groups.234,235 Both targeted and global LC–ESI-MS glycan profiling for early-stage hepatocellular carcinoma was statistically evaluated.236
While the glycomic changes in cancer observed by different groups due to immunoglobulins (mostly biantennary and bisecting structures) can be informative in their own right,237 there now seems to be a shift of interest toward tri- and tetra-antennary structures and their highly variable substitutions by fucosyl and sialyl groups. This has been seen with the smoking and lung cancer-induced changes in N-glycosylation of serum proteins,238 ovarian cancer,239,240 metastatic pancreatic cancer,241 predictions in castration-resistant prostate cancer,242 and breast cancer.243 Because of a very extensive number of isomeric possibilities, this area appears to be among the most challenging in the field of glycoanalysis and a wider variety of authentic standards are still needed to push this area forward.
To address the isomerism issues with larger N-glycan structures, analytical information from different techniques needs to be collected and combined. Besides the analysis of different fucosyl isomers, it will be essential to distinguish among different combinations of possible α2,3- and α2,6-linkages of sialic acids. An initial step in this direction is indicated in Figure 9, which depicts an annotated microchip electropherogram of serum N-glycans in connection with colorectal cancer194 obtained through a combination of MS data and electrophoretic resolution of samples derivatized through methylamidation. However, for the more challenging isomeric variations, additional suitable combinations of LC–MS or CE–MS must still be developed.
Recovering proteins for the N-linked glycan analyses from formalin-fixed paraffin-embedded tissues has been a major technical improvement in the field of biomarker research.242,244 While fresh or frozen tissues are preferred for analyses whenever feasible, this type of sample recovery now allows researchers to utilize the specimens previously stored in the sample banks for histological and pathological investigations. It has been encouraging that MS and CE studies are still feasible from the extracts of these “rejuvenated” samples. However, a most remarkable advance of the recent period has been made via MALDI-MS imaging (MALDI-MSI). With this methodology, both tissue staining of the protein and glycan profiles can be overlaid as demonstrated in the 2-dimensional distribution mapping of N-glycans on cancer tissues (Figure 10).32,33
Among other post-translational modifications of proteins involved in neurodegenerative diseases, glycosylation is garnering increased attention. A particular driving force of this interest is the discovery of prognostic biomarkers of neurodegenerative diseases, especially the prevalent Alzheimer disease (AD), since it is believed that early treatment is crucial in order to achieve treatment success. Unfortunately, the success in finding biomarkers of AD has been relatively limited to date.245,246 Interestingly, many of the proteins believed to be involved in the etiology of AD carry abundant PTMs such as phosphorylation and glycosylation and may also directly influence the activity of glycosyltransferases and glycosidases in the brain.247,248 To our knowledge, only a few attempts to find novel glycan biomarkers of human AD and various mouse neurodegenerative disease models have been published recently.249,250
The search for glycosylation-associated biomarkers is likely to grow in importance due to already proven connections to inflammatory diseases and cancer. With the current emphasis on personalized medicine, still an area largely dominated by genetic biomarkers,251 there is a significant rationale, along with opportunities, for the development of precise and robust N-glycan profiling methodologies. Moreover, the current emphasis on merging the structural information obtained through glycoproteomic and glycomic measurements will likely enhance the overall understanding of the differences between physiological and pathological states as well as the main genetic and environmental attributes of biochemical individuality. A recent minireview124 provides an enthusiastic assessment of the analytical technologies and approaches to unravel the complexities of the human N-glycoproteome and its regulation in health and disease. Several pathway databases are currently available that may assist in potential glycan biomarker discovery and biological etiology including KEGG GLYCAN (http://www.genome.jp/kegg/glycan), MetaCyc (http://metacyc.org/), Carbohydrate-Active enZYmes Database (CAZy) (http://www.cazy.org/), UniCarbKB (http://www.unicarbkb.org/), and STRING (http://string-db.org/).
Bacteria can produce a tremendous variety of N-glycans. Eukaryotic N-glycosylation is an essential process, whereas it is believed to be inessential in bacteria and species-dependent in Archaea.252 Even more interesting is that in Archaea the linking monosaccharide to the peptide portion can vary from N-acetylgalactosamine, N-acetylglucosamine, glucose, to even other hexoses.252 Similar to eukaryotes, bacteria have their protein N-linked glycosylation occur before complete folding of substrate proteins.253 While the most studied case of N-linked glycosylation in bacteria is that of Campylobacter jejuni252 whose heptasaccharide glycan plays a role in its pathogenicity,253 other bacteria, such as Helicobacter pylori and Neisseria meningitidis, use N-glycans to remain hidden, while Burkholderia cenocepacia and Francisella tularensis use N-glycans to communicate with their host.254 Through an MS-based approach, 618 glycosylated residues were found that belong to 149 proteins in E. coli.97 It was demonstrated that protein glycosylation in enterotoxigenic E. coli plays a role both in virulence as well as cellular physiology. In Halobacterium salinarum, it was found that two unique N-glycans, one linked via N-acetylgalactosamine and the other via glucose, were attached to a single glycoprotein.252 Interestingly, when Haloferax volcanii is grown under varying sodium chloride salt concentrations, there exist three N-glycans that can be attached to the glycoprotein.252 At high levels of NaCl, a glycan composed of 1–4 linked glucose subunits was attached to Asn-13 and Asn-498, while a glycan composed of galactose and idose subunits was attached to Asn-274 and Asn-279. At medium levels of NaCl, a pentasaccharide glycan containing hexose, hexuronic acid, and hexuronic acid methyl ester subunits, with a terminal mannose, was found attached to Asn-13 and Asn-83. Lastly, at lower concentrations of NaCl, a tetrasaccharide glycan with rhamnose, hexose, and sulfated hexose residues was observed at Asn-498.252
The complexity of different glycomes, in terms of both the structure and physiological roles, provides numerous incentives for a further development of new analytical tools, standards, and methodologies. While mass spectrometry continues to grow in importance, new roles have also been identified for its combination with ion mobility techniques. In addition, glycomic and glycoproteomic measurements based on advances in chromatographic columns and capillary electrophoresis are becoming increasingly useful to glycoscientists in their efforts to profile and quantify glycoconjugates from complex biological samples during large-scale studies. Glycoanalysis is perhaps more multimethodological than the other omics areas. Chemical derivatization of carbohydrate structures at microscale continues to potentiate more precise and sensitive measurements in both separations and MS measurements. A highly desirable merger between glycomics and glycoproteomics is being increasingly seen with the emergence of powerful data processing and interpretation. The pioneering applications of MS imaging to biological tissues indicate new avenues to knowledge about the distribution of biologically important glycoconjugates in different cells. The applications selected in this review underscore the rationale and needs for further methodological developments.
Financial support from the National Institutes of Health (Grant 5U01GM116248-02 to N.L.B.P. and Grants R01GM106084 and R21GM118340 to M.V.N.), National Science Foundation (Grant CHE-1363313 to N.L.B.P.), and the Joan and Marvin Carmack Chair fund (graduate fellowship to G.N.) is gratefully acknowledged. Partial support from the Czech Grant Agency (Grant GACR 16-04496S) to M.V.N. is also appreciated. We thank Helena Soini for her help in preparation of the manuscript.
Stefan Gaunitz earned his M.Sc. degree in Molecular Biology from Södertörn University, Sweden (2000). S.G. completed his Ph.D. studies at Karolinska Institutet, Sweden, under the supervision of Prof. Jan Holgersson and Dr. Anki Nilsson (2013). He conducted studies on recombinant mucins with tailored glycosylation and their ability to function as glycan inhibitors. S.G. then joined Prof. Lars Tjernberg’s group at Karolinska Institutet where he investigated novel potential glycan biomarkers of Alzheimer Disease (2013–2015). S.G. is currently working in the laboratory of Prof. Milos V. Novotny at Indiana University, where he is developing novel glycomic and glycoproteomic protocols which assist the study of analytically challenging but biologically relevant glycans, such as biomarkers.
Gabe Nagy completed his bachelor of science (B.S.) degree at Creighton University. He is presently in the Ph.D. program at Indiana University under the guidance of Dr. Nicola L. B. Pohl. His research focuses on the development of mass spectrometry-based methods towards carbohydrate sequencing as well as liquid-chromatographic techniques for the purification of protected carbohydrates.
Nicola L. B. Pohl earned her A.B. degree from Harvard-Radcliffe College and her Ph.D. from the University of Wisconsin-Madison. After pursuing postdoctoral research as an NIH fellow at Stanford, she joined the faculty of Iowa State University in 2000. In 2012, she moved to Indiana University in Bloomington, where she currently holds the Joan and Marvin Carmack Chair in Bioorganic Chemistry. Her research meshes analytical, chemical biology, and synthetic methods to study carbohydrates, including the development of automated solution-phase-based oligosaccharide synthesis and of new mass-spectrometry-based methods for de novo carbohydrate sequencing.
Milos V. Novotny, a native of Brno, Czech Republic, received his education in biochemistry at the University of Brno/Masaryk University (Magister, RNDr, and DrSc) and postgraduate training in chromatography at the Czechoslovak Academy of Sciences. He emigrated in 1968, first to Sweden and then to the U.S., to work as a Research Associate at the Royal Karolinska Institute (Stockholm) and a Robert A. Welch Postdoctoral Fellow at the University of Houston, respectively. He joined the chemistry faculty at Indiana University (IU), Bloomington, in 1971. At IU, he is presently Distinguished Professor Emeritus and Adjunct Professor of Medicine. Professor Novotny is a recipient of more than 40 awards, medals, and other distinctions for his pioneering studies in separation science and bioanalytical chemistry, including four ACS awards, three honorary doctorates (Uppsala University, Masaryk University, and Charles University) and a membership in two foreign academies (Swedish Royal Society for Sciences and the Learned Society of Czech Republic). Professor Novotny has been active in development of glycoscience techniques since the late 1980s. He directed the National Center for Glycomics and Glycoproteomics (2004–2009) and still directs the Institute for Pheromone Research.
Milos V. Novotny: 0000-0001-5530-7059
The authors declare no competing financial interest.