Glycomics is the comprehensive study of all glycans expressed in biological systems. The biosynthesis of glycans relies on a number of highly competitive processes involving glycosyltransferases. Glycosylation is therefore highly sensitive to the biochemical environment and has been implicated in many diseases, including cancer. Recently, interest in profiling the glycome has increased because of the potential of glycans as disease markers. In this regard, mass spectrometry is emerging as a powerful technique for profiling the glycome. Global glycan profiling of human serum based on mass spectrometry has already yielded potentially promising markers for several types of cancer and other diseases.
The glycome, which is the glycan analog of the proteome and the genome, is loosely defined as the glycan components of a biological source. Because glycans are far more diverse than proteins, it is not technically possible to obtain a complete representation of the glycome. Classifying the glycome relative to the genome, the proteome, and the metabolome is difficult because glycans are often bound to proteins, which are directly correlated to the genome. However, glycans are produced by proteins, making them more like metabolic products. In this regard, glycans are unique in that they link the three major areas of genomics, proteomics, and metabolomics, since saccharides are found in all three groups. As disease markers, glycans have major advantages: (1) Changes in glycosylation in disease states are supported by 50 years of glycobiology, particularly in the study of cancer. (2) N- and O-linked glycans are small molecules and are therefore easy to quantitate, like metabolites. Glycans therefore hold considerable promise as disease markers (1–3).
Proteins are often modified by the attachment of glycans during normal protein synthesis. It is estimated that over 70% of all human proteins are glycosylated (4). Glycosylation is often found on cell surfaces and in extracellular matrices, making it the first point of contact in cellular interactions (5, 6). Glycan biosynthesis is therefore more significantly affected by disease states than is protein production. It is now well established that glycosylation differs significantly in cancer cells compared to normal cells (7, 8).
The large diversity of glycans makes a general discussion of all types of glycans and their glycoconjugates difficult. In human serum, the abundant glycoconjugates are glycoproteins. Therefore, only N- and O-linked glycans attached to glycoproteins are discussed in this review. Other glycoconjugates such as glycolipids, peptidoglycans, and glycosaminoglycans are not covered because they require different analytical techniques. Glycomics represents a new paradigm for cancer biomarker discovery. This short review focuses on mass spectrometry methods for glycan-based disease markers.
In the glycomics approach, glycans are harvested and used to determine whether the glycosylation has changed in disease-state samples compared to healthy controls (9–13) without prior knowledge of the associated proteome. Advances in glycomics analysis rely on the analytical tools for glycomic profiling. Mass spectrometry provides a rapid and sensitive tool for component analysis. Mass spectrometry has therefore emerged as a central tool for glycomics analysis. It is also a precise tool for structural elucidation resulting in significant progress towards understanding the role of the glycome in many biological systems (14, 15). As the methods for profiling oligosaccharide composition and structures are becoming refined, the search for biomarkers is intensifying (16, 17).
Mass spectrometers capable of glycan analysis typically employ either matrix-assisted laser desorption/ionization (MALDI) or electrospray ionization (ESI), with or without chromatographic separation. High-mass-accuracy instruments such as modern time-of-flight (TOF), ion cyclotron resonance (ICR), and Orbitrap mass analyzers are useful for glycan detection because low or even sub-ppm mass errors are often required for exact mass annotation. Lower performance instruments can make glycan analysis challenging by causing false assignments due to the high uncertainties associated with their measurements (18). Tandem MS can be used to remedy false assignments by providing structural information and glycan compositions. Structural elucidation can be achieved using glycan fragments, exact mass, and exoglycosidase digestion.
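The link between mass accuracy and false assignments can be made concrete with a short calculation. The following is a minimal Python sketch of the standard parts-per-million error formula and the acceptance window it implies. The function names and the example m/z values are illustrative assumptions and are not taken from the cited studies.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def mass_window(theoretical_mz: float, tolerance_ppm: float) -> tuple:
    """Lowest and highest m/z accepted for a given ppm tolerance."""
    delta = theoretical_mz * tolerance_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


# Hypothetical example: a peak measured at m/z 933.3200 compared with a
# theoretical value of m/z 933.3170.
error = ppm_error(933.3200, 933.3170)
low, high = mass_window(933.3170, 5.0)
print(f"error = {error:.2f} ppm")
print(f"5 ppm window = {low:.4f} to {high:.4f}")
```

The window widens in proportion to the tolerance, so an instrument that can only support a tolerance of tens of ppm accepts more candidate compositions for the same peak than one operating at a few ppm.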
Although many groups study glycans with mass spectrometry, their chemical procedures share some common features. The workflows for glycan analysis are illustrated in Figure 1. The scheme is broken into seven tasks (listed on the left side) that are almost universally implemented in some form in all of the methods.
Aside from proper study design and sample selection, the chemical workup of the samples and the mass spectrometric analyses also have a large impact on the discovery and quality of glycan disease markers. Proper sample preparation improves the sensitivity and reproducibility of the analysis. Glycans are generally isolated from glycoprotein-containing samples by first cleaving off the glycans and then purifying them with solid phase extraction or high performance liquid chromatography (HPLC). Samples are often concentrated with vacuum centrifuges or lyophilizers prior to mass spectrometry analysis. The glycans may be derivatized (e.g., by methylation) to improve ionization and the stability of ions in the mass spectrometer.
Many of the biological sources of glycoproteins contain levels of salts that are too high for mass spectrometric analysis because the excess salt decreases the ionization of the desired analyte. Desalting is often accomplished by filtration (19, 20), dialysis (21, 22), cation exchange resin (23, 24), or carbon-based column adsorption (11, 13, 16, 18).
Methods for releasing the glycans from glycoproteins depend on the attachment of the glycans. N-linked glycans (N-glycans) attach via the nitrogen of an asparagine, while O-linked glycans (O-glycans) attach via the oxygen of a serine or threonine. Reductive beta elimination (10, 13) is the most common method for releasing O-glycans, while Peptide N-Glycosidase F (PNGase F) enzymatic cleavage (11, 18) is the most common for releasing N-glycans. One problem with the beta elimination reaction is the “peeling” reaction, in which the strong base cleaves off a monosaccharide from the glycan, producing degraded structures. One alternative is the ammonia-based beta elimination reaction (22), which uses milder conditions. Alternatively, hydrazinolysis (21) can be used to release N-linked and O-linked glycans simultaneously but is rarely used due to the relative difficulty and hazards of the process. Hydrazine monohydrate has been substituted for anhydrous hydrazine to decrease the explosion danger but was not as effective in releasing the glycans (25).
Purification of the glycan mixture brings its own challenges. Isolating the glycans from peptides or proteins present in the solution is necessary because these compounds ionize more readily than glycans, and their presence in the mixture will severely suppress glycan signals. Purification methods take advantage of the differences in polarity between peptides and glycans. Large amounts of proteins can be precipitated from the solution using methanol (22, 24), ethanol (11, 13, 16, 18), or acetone (24). Lectin affinity enrichment has also been used to concentrate the glycoproteins prior to glycan release (20). C18 stationary phases (20, 26–30) are used to retain residual amounts of proteins. Amine (30), amide (30–32), and graphitized carbon (11, 13, 16, 18, 24, 26–29, 33, 34) media can be used to retain glycans for subsequent elution into solution. Solvent removal is critical for sample concentration since the quantity of glycans is often too low for sensitive detection. Since raising the temperature to remove the solvent will degrade or destroy the glycans, vacuum centrifugation, lyophilization (21, 23, 29, 34), or bubbling dry nitrogen gas through the sample (22) is used instead.
Mass spectrometric analysis of oligosaccharides can be performed in the native or derivatized state. Whether to derivatize the glycans or analyze them in the native state prior to MS analysis is a matter of preference. Derivatization can increase sensitivity and decrease fragmentation. The latter is particularly important with fucosylated and sialylated glycans. Even with the “soft” ionization of electrospray and the less “soft” MALDI, there is still impetus to modify the sialic acid for stability. However, derivatization appears to miss modifications on oligosaccharides such as sulfation. Furthermore, derivatization prevents prefractionation by solid phase extraction (SPE). As sialylated glycans make up the most abundant species in human serum, they can suppress other less abundant components such as the high mannose glycans. Elution of the glycans from SPE with varying ratios of acetonitrile and water increases the number of glycans observed by mass spectrometry and allows better observation of these less abundant yet important components. Nonetheless, a wide variety of chemical procedures can be employed to improve glycan ionization and ion stability in the mass spectrometer. Methylation (35), permethylation (23, 24, 26–28), and pyridylamination (25, 26) are examples of methods used to stabilize the sialic acid by forming esters. These methods have the additional benefit of allowing positive ion detection of sialylated glycans concurrently with the neutral glycans.
Other chemical modifications can be useful for quantitating oligosaccharides. The reducing end of the glycans is the target for incorporating stable isotopes (23) that enable relative quantitation or fluorescent tags (22, 30–32) that are readily monitored during liquid chromatography.
Glycan mass profiling (GMP) is a method for analyzing the complete mixture by obtaining masses with little or no separation. This method yields the compositional profile with regard to the number of sialic acids, fucose, hexose, and N-acetylhexosamine. GMP provides rapid analysis based on mass and, therefore, composition; however, it cannot distinguish between isomeric species (compounds with identical mass and composition). Glycan mass profiling is usually performed with MALDI or ESI as the ionization source with only a crude method of glycan purification.
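The relationship between composition and mass that underlies glycan mass profiling can be illustrated with a small calculation. The sketch below is a minimal Python example that assumes native (underivatized) glycans detected as singly charged sodium adducts. The residue masses are standard monoisotopic values, while the function names and the dictionary layout are illustrative choices and do not come from the cited work.

```python
# Monoisotopic residue masses in Da (monosaccharide minus one water).
RESIDUE_MASS = {
    "Hex": 162.052824,     # hexose, e.g. mannose or galactose
    "HexNAc": 203.079373,  # N-acetylhexosamine
    "Fuc": 146.057909,     # fucose (deoxyhexose)
    "NeuAc": 291.095417,   # N-acetylneuraminic acid (sialic acid)
}
WATER = 18.010565
SODIUM_ION = 22.989218  # Na+ (sodium atom minus one electron)


def neutral_mass(composition: dict) -> float:
    """Monoisotopic mass of a free glycan with the given residue counts."""
    residues = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return residues + WATER


def sodiated_mz(composition: dict) -> float:
    """m/z of the singly charged sodium adduct [M+Na]+ (assumed ion form)."""
    return neutral_mass(composition) + SODIUM_ION


# Trimannosyl core, Man3GlcNAc2, written as a composition.
core = {"Hex": 3, "HexNAc": 2}
print(f"neutral mass = {neutral_mass(core):.4f} Da")
print(f"[M+Na]+      = {sodiated_mz(core):.4f}")
```

Because the calculation depends only on residue counts, any two isomers with the same composition return the same value, which is why mass profiling alone cannot separate them.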
In our laboratory, glycan mass profiling is done using a matrix-assisted laser desorption/ionization Fourier transform ion cyclotron resonance (MALDI-FTICR) mass spectrometer equipped with a 7.0 T magnet and infrared multiphoton dissociation (IRMPD) for tandem mass spectrometry. The FTICR mass analyzer, with its high resolution of 10^5–10^6 at full width at half maximum and high mass accuracy of less than 10 parts per million (ppm) with internal calibration, allows us to profile the glycans using exact masses. IRMPD and collision-induced dissociation (CID) allow us to confirm the monosaccharide compositions based on the mass losses. The reproducibility of MALDI has been established in several studies, including one from our laboratory (11).
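Confirming a composition from mass losses amounts to comparing the spacing between fragment peaks with the known residue masses. The following Python sketch shows the idea under simplifying assumptions: all peaks are singly charged, each step corresponds to the loss of one residue, and the peak list is hypothetical rather than measured data. The function name and the tolerance value are likewise illustrative.

```python
# Monoisotopic residue masses in Da (monosaccharide minus one water).
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "Fuc": 146.057909,
    "NeuAc": 291.095417,
}


def assign_losses(peaks_mz, tolerance_da=0.01):
    """Label the mass difference between consecutive peaks, heaviest first.

    Returns a list of (difference, residue name or None) tuples.
    """
    ordered = sorted(peaks_mz, reverse=True)
    losses = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        diff = heavier - lighter
        label = next(
            (name for name, mass in RESIDUE_MASS.items()
             if abs(diff - mass) <= tolerance_da),
            None,  # no residue matches this spacing
        )
        losses.append((round(diff, 4), label))
    return losses


# Hypothetical tandem MS peaks for a sodiated Hex3HexNAc2Fuc1 precursor.
peaks = [1079.3749, 933.3170, 771.2642, 609.2114]
for diff, label in assign_losses(peaks):
    print(f"loss of {diff:.4f} Da -> {label}")
```

A spacing that matches no residue is returned with a label of None, which in practice would flag either a cross-ring fragment, a different charge state, or an unrelated peak.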
The extent of glycosylation and the types of O- and N-glycans have been shown to change in cancer (5, 7). The selection of one group over the other for biomarker discovery is, however, not a trivial choice, because the release procedures and the analyses are distinct. Previous studies from this laboratory focused on profiling O-glycans because research had shown more profound changes in these compounds. Indeed, limited studies on ovarian (13, 36) and breast (12) cancer showed differentiation between disease and control. However, the O-glycans were found to be contaminated with a large number of more abundant N-glycans, even though the release procedure was reported to produce predominantly O-glycans. The compositions and structures were confirmed by tandem MS, including CID and IRMPD. Given that the observed changes possibly involved N-glycans, we repeated these mass profiling studies with procedures that release only N-glycans (11). N-linked glycans are readily identified by their trimannosyl core (Man3GlcNAc2), and complex type N-glycans containing sialic acid were observed as the major glycans in human serum (16, 18).
For N-glycans, putative structures can be obtained from the composition. We recently developed a theoretical library for the human serum N-glycome, which is highly effective for automatically annotating mass spectra with a low false positive rate (18).
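Library-based annotation can be sketched as a lookup of each observed peak against the theoretical masses of the library entries. The Python example below uses a tiny hypothetical library of four compositions and assumes singly charged sodium adducts of native glycans. It is not the library described in reference 18, and the peak values, function names, and tolerances are illustrative assumptions.

```python
# Monoisotopic residue masses in Da (monosaccharide minus one water).
RESIDUE_MASS = {"Hex": 162.052824, "HexNAc": 203.079373,
                "Fuc": 146.057909, "NeuAc": 291.095417}
WATER = 18.010565
SODIUM_ION = 22.989218  # Na+ (sodium atom minus one electron)

# Hypothetical mini-library; every entry contains the Man3GlcNAc2 core.
LIBRARY = [
    {"Hex": 3, "HexNAc": 2},
    {"Hex": 5, "HexNAc": 2},
    {"Hex": 5, "HexNAc": 4},
    {"Hex": 5, "HexNAc": 4, "Fuc": 1},
]


def sodiated_mz(composition):
    """Theoretical m/z of the [M+Na]+ ion for a composition."""
    residues = sum(RESIDUE_MASS[name] * n for name, n in composition.items())
    return residues + WATER + SODIUM_ION


def annotate(peaks_mz, library, tolerance_ppm=10.0):
    """Return (peak, matching compositions) pairs within the ppm tolerance."""
    annotations = []
    for mz in peaks_mz:
        matches = []
        for composition in library:
            theoretical = sodiated_mz(composition)
            if abs(mz - theoretical) / theoretical * 1e6 <= tolerance_ppm:
                matches.append(composition)
        annotations.append((mz, matches))
    return annotations


# Hypothetical peak list: two library matches and one unassigned peak.
for mz, matches in annotate([933.3185, 1257.4301, 1500.0000], LIBRARY):
    print(mz, matches if matches else "no match")
```

Tightening `tolerance_ppm` from 10 to 5 drops the match for the second peak in this example, which shows how the mass accuracy of the instrument bounds the tolerance that can be used and therefore the false positive rate.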
Until now, glycan marker discovery has focused on glycan mass profiling. Individual oligosaccharides (specific isomeric structures) may provide more robust glycan markers with higher specificity than composition alone. Changes in specific linkages have been attributed to some diseases (37, 38). The number of isomers greatly surpasses the number of compositions, providing a significantly larger set of potential glycan markers (16, 28). To gain access to individual glycans, separation of the isomers is required. Analytical methods for separating mixtures into individual isomers, such as nanoflow liquid chromatography (16, 28, 39) and capillary electrophoresis (40, 41), coupled with mass spectrometry have proved highly useful. A method to assess the diversity of the N-glycans in human serum without derivatization has been developed by our laboratory, employing porous graphitized carbon as the stationary phase (16). Glycans of individual serum samples were profiled and compared. Complex type N-linked glycans were the most abundant glycans in human serum, accounting for ~96% of the total glycans identified, while hybrid and high mannose type glycans comprised the remaining ~4%. The ability to separate and simultaneously analyze neutral and anionic N-glycans from human serum without derivatization provides a rapid yet highly sensitive and highly specific tool for disease biomarker discovery.
There are now several recent publications that focus on N- and O-glycans as disease markers (11–13, 27, 28, 34, 42–53). Alterations in the degree of branching and in the levels of sialylation and fucosylation in N-glycans have been reported as a consequence of disease (1, 54). Similarly, changes in O-glycans in mucins have also been associated with disease (50, 54). In cancer, changes in the branching of N-glycans, truncation of O-glycans, and changes in the amount, linkage, acetylation, and expression of sialic acids have all been suggested to signify the disease (54).
Table 1 shows a summary of some of the potential N- and O-glycan markers for cancer gathered from the literature. The table is by no means complete, but it provides a limited overview and illustrates the potential of glycans as markers for cancers. The biological sources were mostly sera and cell lines, but many other tissue samples have been examined. In prostate cancer, for example, there are differences in mannosylation and fucosylation, and complex N-glycans specifically were found to vary with the disease (11, 44, 51). In breast cancer, a fucosylated N-glycan was found as a potential marker, but there was also an increase in the intensity of the high-mannose N-glycans and in the fucosylation of O-glycans (27, 31, 50, 53). Fucosylation and sialylation levels show conflicting results across studies (27, 53). In liver cancer, fucosylation, whether in the core or the outer arm, tends to be elevated (28, 48). The same increase in levels of fucosylation can be seen in ovarian and pancreatic cancer (46, 47, 49). Although there are conflicting results for some of the markers found in cancer, e.g., some studies show an increase and others a decrease in mannosylation, there are clearly distinct differences in the levels of N- and O-glycans.
Glycan markers hold considerable opportunities and challenges for disease diagnosis because the glycosylation machinery is highly sensitive to the biochemical environment. Mass spectrometry-based glycomics has the potential to provide a single platform to monitor several diseases simultaneously. The main glycomics method currently involves mass mapping strategies in which composition profiles are obtained. However, structure-specific and protein-specific glycomics have yet to be explored. These strategies will require further analytical development. The serum N-glycome will need to be annotated, and glycoproteomic approaches that describe both protein and glycan simultaneously need to be developed. Furthermore, as the strategies become more refined and the sample sets become larger, the glycomics approach to biomarker discovery will not only impact the diagnosis of disease but will also provide a new paradigm for understanding the role of the glycome in many biological areas.
We are grateful for funds provided by the National Institutes of Health (RO1GM049077), the Ovarian Cancer Research Fund, and the Ann Garat Fund.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.