The biological functions of glycoconjugate glycans arise in the context of structural heterogeneity resulting from non-template-driven biosynthetic reactions. Such heterogeneity is particularly apparent for the glycosaminoglycan (GAG) classes, of which heparan sulfate (HS) is of particular interest for its ability to bind many classes of growth factors and growth factor receptors. The structures of HS chains vary according to spatial and temporal factors in biological systems, a mechanism whereby the functions of the relatively limited number of associated proteoglycan core proteins are elaborated. Thus, there is a strong driver for the development of methods to discover functionally relevant structures in HS preparations from different sources. In the present work, a set of targeted tandem mass spectra was acquired in automated mode on HS oligosaccharides derived from two different tissue sources. Statistical methods were used to determine the precursor and product ions whose abundances differentiate between the tissue sources. The results demonstrate considerable potential for using this approach to constrain the number of positional glycoform isomers present in different biological preparations, with the aim of discovering functionally relevant structures.
Heparan sulfate (HS) is a glycosaminoglycan (GAG) that consists of repeating disaccharide units containing variable modifications. HS is present on all animal cell surfaces and directly interacts with a myriad of extracellular signaling molecules. It is required for embryonic development and for the functioning of every adult physiological system. The interactions between many families of growth factors and growth factor receptors are modulated by the structures of HS expressed on cell surfaces and in extracellular matrices. Thus, it is not surprising that HS structural biochemistry is central to the understanding of disease mechanisms including tumor growth, angiogenesis, amyloid deposition, tissue remodeling and repair, and host-pathogen interactions.
Biosynthesis of HS begins in the endoplasmic reticulum with the addition of a tetrasaccharide linker that is attached to a proteoglycan serine residue via a xylose monosaccharide. The saccharide chains are extended and subsequently modified by a series of enzymes in the Golgi apparatus. Nascent HS chains consist of repeating disaccharide units of glucuronic acid (GlcUA) and N-acetylglucosamine (GlcNAc) [4GlcAβ1-4GlcNAcβ1-] that undergo a series of modification reactions. The chains are first acted upon by N-deacetylase/N-sulfotransferase (NDST) enzymes that remove N-acetate groups and replace them with N-sulfate groups. Heparins, expressed in connective tissue mast cell granules, are a class of HS in which the NDSTs act to modify nearly all of the GlcNAc residues in the chain. In most other tissues, the NDSTs create HS chains with domains of high degree of N-sulfation and those with a high degree of N-acetylation and those with intermediate content. Following the actions of the NDSTs, the chains undergo O-sulfation and/or uronic acid epimerization. Such modifications occur at specific sites. The most common disaccharide of heparin is 4IdoUA(2S)α1-4GlcNS(6S)α1, where 2S = 2-O-sulfation and 6S = 6-O-sulfation. Addition of a sulfate to position 3 of the glucosamine is a rare but biologically significant modification. The general structure of HS from most tissues differs from that of heparin in that unsulfated and intermediate sulfation domains are present.
HS chains are heterogeneous by nature and expressed in a spatially and temporally regulated manner. Thus, biomedical investigations require HS structural information from samples isolated from specific tissues and disease states. The analytical challenge is to determine the structures of HS chains, given their inherent heterogeneity of composition with respect to biosynthetic modifications. For a given composition, it is possible for a set of positional isomers to be present, and this must be taken into consideration. Analytical strategies for HS analysis include (1) chemical or enzymatic partial hydrolysis of labeled or unlabeled HS [3, 4]; (2) separation using liquid chromatography modes including size exclusion, strong anion exchange, reversed-phase and reversed-phase ion-pairing chromatography, or capillary electrophoresis; (3) comparative detection between the different conditions of hydrolysis. Although each of these strategies provides some information about HS structure, complementary strategies must be combined to obtain complete sequence information.
Two approaches used widely for the structural elucidation of HS are nuclear magnetic resonance (NMR) spectroscopy  and mass spectrometry (MS) [7–10]. The use of NMR provides detailed information regarding uronic acid isomers and positions of sulfate and acetate groups. The major limitation of this technique is the large amount of starting material required for analysis. Furthermore, NMR has been used primarily to characterize smaller oligomers, limited typically to hexasaccharides. Tandem MS has been proved to be a useful, highly reproducible and very sensitive means for the characterization of GAG disaccharides [11–17]. A typical approach to identify and quantify HS oligosaccharide isomer mixtures was developed using tandem MS and consisted of comparison of the product ion profiles of unknown isomer mixtures with those acquired from pure isomer standards [14, 18–22]. Selected fragment ions of the pure standards were observed to be diagnostic for the abundances of the isomeric compounds in the mixture. A correlation coefficient was calculated for each fragment ion of the standards and a system of equations allowed calculation of the proportion of each standard in the sample . Pure oligosaccharide standards are available for chondroitin sulfate (CS) GAGs [14, 15, 19, 21–26] but unfortunately not for HS [13, 16]. In addition, because the compounds are typically present in isomeric mixtures when obtained from biological sources, such methods report the mixture percentages of the standard isomers and are incapable of determining complete sequence information. During a collisionally activated dissociation (CAD) tandem mass spectrometry experiment, the sulfate groups of HS chains tend to undergo dissociation at lower vibrational energies than do glycosidic bonds . As a result, tandem MS experiments that involve significant vibrational excitation result in uninformative fragmentation of the sulfate groups. 
The extent to which this occurs is minimized as precursor ion charge states increase, and may also be minimized with the use of cations to pair with the negatively charged sulfate groups [27–29]. Tandem MS sequencing of HS oligosaccharides using CAD, electron capture dissociation, electron detachment dissociation or infrared multiphoton dissociation [29–33] is not straightforward because of the competing fragmentation channels involved during the experiment. After fragmentation, four types of fragment ions are observed in the tandem mass spectra of HS: loss of HSO4−, loss of neutral SO3, glycosidic bond cleavage and cross-ring cleavage. Although glycosidic bond and cross-ring cleavages contain the most useful structural information, interpretation of the tandem mass spectra is complicated by the presence of the accompanying SO3 losses. Under ideal conditions, complementary pairs of glycosidic bond or cross-ring cleavages provide definitive structural information for the complete molecule; however, such pairs are often not observed. Furthermore, for most biological samples a mixture of HS isomers is present for a given oligosaccharide composition, and this complicates the product ion spectra.
In addition, the concept of direct sequencing is not applicable to HS samples that consist of mixtures of isomers. Although it is possible to reduce mixture complexity using multi-step chromatographic purification, this effort is only justified for a sample found to have significant biological activity. For the purpose of discovery and analysis of oligosaccharides of biological and/or therapeutic interest for HS/heparin preparations, it is necessary to interpret information produced directly from samples containing isomeric mixtures. Therefore, an appropriate method for interpretation of such data is essential.
In proteomics, tandem mass spectrometric data are used to search databases generated from genomic information. For such searches, homogeneity of the peptides is generally assumed in order to produce a sequence that matches the product ions observed. For glycomics analysis, product ion m/z and charge state values define monosaccharide compositions and the ion abundances correlate with the type of oligosaccharide isomer. Thus, it is necessary to consider product ion m/z values, charge states and abundances in interpreting the data. While the comparison of tandem MS data to databases or libraries using bioinformatics tools allows facile identification or sequencing of peptides [34–37], the process differs markedly for glycomics. Bioinformatics software tools [38–42] that interpret MS and tandem MS data by searching different databases are available for carbohydrates and GAGs [43, 44] but do not allow for the interpretation of isomeric mixtures. As a result, we have undertaken an effort to meet this need.
Two approaches commonly used to compare raw mass spectral data are clustering analysis and principal components analysis. Cluster analysis consists of dividing data into groups (clusters) in order to capture the natural structure of the data. One of the first references to clustering in the mass spectrometry field was for comparison of alkyl thiolesters and pharmaceutical products [45, 46]. In that work, cluster analysis was used to aid the interpretation of tandem MS data and to classify them into clusters representing the different classes of samples. Clustering analysis of tandem MS data has been applied in the proteomics field for the following purposes: (1) to reduce the number of tandem mass spectra used in the identification of proteins by grouping similar tandem data to decrease the redundancy of the analyses [47–49]; (2) to improve the understanding of fragmentation patterns (fragmentation vs. intensity) and enable improved protein identification algorithms; and (3) in label-free quantification experiments for the discovery of biomarkers [51–54].
We have developed a novel, fully automated approach based on a new interpretation of tandem MS data of HS oligosaccharides extracted from different organ tissues. We applied this approach to a set of data acquired using automated CAD tandem MS acquisition parameters. Tandem MS data on 13 targeted precursor ions were acquired automatically, in quadruplicate, on HS oligosaccharides extracted from each of two bovine tissues (aorta, lung). The corresponding data set comprised approximately 1000 features (product ions), for which manual comparison was not feasible. A strategy based on software and bioinformatics tools commonly used in the proteomics and biomarker fields was developed. Agglomerative hierarchical clustering (AHC) was used on the tandem MS data to demonstrate that sufficient information was present to differentiate isomeric glycoforms in the two organ samples. The analysis was useful for recognizing fragmentation patterns that correspond to organ-specific HS structures.
Heparin lyase III from Flavobacterium heparinum was purchased from Ibex (Montreal, Canada). Heparan sulfate samples from bovine aorta (A) and lung (L) were a generous gift from Dr. Keiichi Yoshida and were prepared as described previously. The extracted HS from bovine aorta and lung was depolymerized to completion using heparin lyase III as previously described. Oligosaccharides were then fractionated using a Superdex Peptide 3.2/30 size exclusion column (GE Healthcare) equilibrated with 50 mM ammonium acetate, with UV detection at 232 nm. The HS oligosaccharides present in fraction 15 from each tissue (Supplemental Figure 1) were analyzed using tandem mass spectrometry as described below.
A ThermoFisher Scientific (San Jose, California) LTQ Orbitrap Discovery mass spectrometer coupled with a Triversa Nanomate system (Advion Biosystems, Inc., Ithaca, NY) was operated in negative ion mode. The Orbitrap ion optics were tuned to eliminate the fragmentation of the HS compounds during the ejection of ions from the ion trap and subsequent transmission to the C-trap. The temperature of the transfer line was set at 150°C. The Triversa Nanomate allowed automated acquisition of MS and tandem MS data.
The following nomenclature is used throughout to define the compositions of HS oligosaccharides: [A,B,C,D,E] represents the number of units in the following composition: [ΔHexA, HexA, GlcN, SO3, Ac]. For example, the composition [1,2,3,4,1] consists of 1 unsaturated hexuronic acid, 2 hexuronic acids, 3 glucosamines, 4 sulfate groups and 1 acetate group. The Domon-Costello nomenclature was used to describe the fragmentation pattern of HS compounds (Supplemental Figure 2). For the tandem MS product ions from GAG oligosaccharides, the number of sulfate (S) and acetate (Ac) groups is given in parentheses after the product ion, e.g. Y2 (S, Ac).
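As an illustration of how the composition code maps onto the m/z values quoted in this work, the following minimal sketch computes the m/z of a deprotonated ion from a composition [ΔHexA, HexA, GlcN, SO3, Ac]. The residue masses are assumptions computed here from standard monoisotopic atomic masses (they are not taken from the article), and the sketch assumes a free reducing end and ions of the form [M − zH]z−. The function and variable names are hypothetical.

```python
# Monoisotopic residue masses in Da. These are assumed values derived from
# standard atomic masses, not values reported in the article.
RESIDUE_MASS = {
    "dHexA": 158.02152,  # unsaturated hexuronic acid residue, C6H6O5
    "HexA": 176.03209,   # hexuronic acid residue, C6H8O6
    "GlcN": 161.06881,   # glucosamine residue, C6H11NO4
    "SO3": 79.95681,     # sulfate substituent (adds SO3)
    "Ac": 42.01056,      # acetyl substituent (adds C2H2O)
}
WATER = 18.01056   # one water closes the chain (free reducing end assumed)
PROTON = 1.00728

ORDER = ("dHexA", "HexA", "GlcN", "SO3", "Ac")


def neutral_mass(composition):
    """Neutral monoisotopic mass for a [dHexA, HexA, GlcN, SO3, Ac] code."""
    return sum(n * RESIDUE_MASS[k] for k, n in zip(ORDER, composition)) + WATER


def mz_negative(composition, charge):
    """m/z of the [M - zH]z- ion, where charge is the positive integer z."""
    return (neutral_mass(composition) - charge * PROTON) / charge


print(round(mz_negative([1, 2, 3, 4, 1], 4), 1))  # 342.3
print(round(mz_negative([1, 0, 1, 1, 1], 2), 1))  # 228.5
```

Under these assumptions, the computed values agree with the m/z values quoted in the text for [1,2,3,4,1] at charge 4 and [1,0,1,1,1] at charge 2.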
Proper control of the electrospray source conditions was determined to be essential for generation of high quality tandem mass spectra from HS oligosaccharides. In summary, if the ion charge state was higher than the number of sulfate groups of the oligosaccharides, the tandem mass spectra were rich in structural information. Otherwise, the tandem mass spectra contained abundant ions produced from the dissociation of sulfate groups. The charge state of the precursor ion is dependent on the nano-electrospray conditions. We observed that dissolving the HS oligosaccharides in 50% methanol gave a combination of high ion charge states and a stable nano-electrospray. The Triversa Nanomate method was programmed for 5 minutes of acquisition with the following parameters: N2 pressure 0.7 psi, voltage −1.3 kV. The nano-electrospray mass spectra showed the presence of 13 precursor ions common to all tissue types. They are represented here using the nomenclature described above: [1,0,1,1,1]−2 (m/z 228.5), [1,0,1,2,0]−3 (m/z 247.5), [1,1,2,3,1]−4 (m/z 238.0), [1,1,2,2,0]−3 (m/z 277.3), [1,1,2,2,1]−3 (m/z 291.0), [1,1,2,3,0]−3 (m/z 303.7), [1,1,2,3,1]−3 (m/z 317.7), [1,1,2,2,0]−2 (m/z 416.1), [1,1,2,2,1]−2 (m/z 437.1), [1,2,3,4,1]−5 (m/z 273.6), [1,2,3,3,1]−3 (m/z 317.7), [1,2,3,3,1]−4 (m/z 322.3), [1,2,3,4,1]−4 (m/z 342.3) and [1,2,3,3,1]−3 (m/z 430.1).
The method was as follows: one Orbitrap full scan (m/z 200–1,000 with RP 30,000 at m/z 400, 5×10^4 target value, 500 ms maximum injection time, 1 microscan) was followed by 4 CAD scans (RP 15,000 with 2×10^4 target value, 1,000 ms maximum injection time, 20 microscans, isolation window 3, energy 30) detected in the Orbitrap on the 1st, 2nd, 3rd and 4th most abundant ions detected in the corresponding full scan. MS and tandem MS spectra were acquired in profile mode. A list of the 13 common precursor ions was obtained and used for the data acquisition. The exclusion time was 300 s and the method was run for 5 min. The samples were acquired in quadruplicate technical replicates to allow statistical treatment. The designated precursor ions are shown in Supplemental Figure 1b,c.
An overview of the data analysis approach is shown in Figure 1. The set of raw data files was processed using Proteome Discoverer version 1.1 (ThermoFisher Scientific) and converted to the Dta file format. The Dta file names were changed manually to simplify handling of the files and to reflect their content: Type of Tissue (Aorta/Lung)_Replicate_(1/2/3/4)_Precursor ion.dta. The Dta file set was then analyzed using Progenesis MALDI version 1.2 (Nonlinear Dynamics, UK). Using Progenesis MALDI we performed spectral alignment and normalized the Dta mass lists across the complete data set of variance. The mass accuracy of the data was reduced to 0.1 Da via binning, which did not affect the resulting clustering analysis. Each product ion present in the tandem mass spectra from the different tissue types was considered to be an independent feature. Analysis of variance (ANOVA) was performed on the product ions measured across the complete data set; p-values <0.05 were taken to indicate significant changes across the different groups. A total of 78 product ions were considered statistically different, with p-values <10^−16 (Supplemental Table 1).
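The binning and ANOVA steps described above can be sketched as follows. This is a minimal illustration and not the Progenesis MALDI implementation: the function names, the choice of summing intensities within a bin, and the example intensities are all assumptions. Only the F statistic is computed; converting it to a p-value requires the F distribution, which is omitted here.

```python
from collections import defaultdict


def bin_spectrum(peaks, width=0.1):
    """Sum (m/z, intensity) peaks into bins of the given width in Da.

    Keys are integer bin indices; bin k is centred near k * width.
    Summing within a bin is an assumed choice for this sketch.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / width)] += intensity
    return dict(binned)


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of replicate values."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    if ss_within == 0:
        return float("inf")  # replicates identical within every group
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical normalized intensities of one binned product ion,
# four replicates per tissue.
aorta = [0.30, 0.32, 0.29, 0.31]
lung = [0.10, 0.12, 0.11, 0.09]
print(anova_f([aorta, lung]))
```

A large F value for a feature indicates that the between-tissue variation dominates the replicate-to-replicate variation, which is the property the ANOVA filter selects for.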
Primary statistical analysis using agglomerative hierarchical clustering (AHC) was performed using Progenesis MALDI. AHC is one of the most common statistical tools used to define the degree of similarity/dissimilarity between objects and groups. It allows for iterative grouping or segmentation of objects, here precursor ions or product ions, into clusters to define the proximity of the objects. A distance measurement was defined for each AHC. Further statistical analyses using AHC and manual confirmation were performed using Microsoft Excel 2010. The data were clustered using both similarity and dissimilarity methods. Similarity clustering across precursors was performed to see if the precursors could be grouped dependent upon the origin of the tissue. Pearson’s correlation was used, and varied from 0 to 1 with 0 indicating the variables were uncorrelated. For dissimilarity, it was of interest to determine which product ions of the precursor ions differed most significantly in abundance. The clustering was realized using the Euclidean distance method.
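To make the clustering step concrete, the following minimal sketch implements agglomerative hierarchical clustering with a pluggable distance function. It is not the Progenesis MALDI algorithm: average linkage is an assumption (the linkage rule is not stated in the text), the similarity case is expressed here as the distance 1 − r for Pearson's r, and the four spectra are hypothetical intensity vectors over the same four product ion bins.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def euclidean(x, y):
    """Euclidean distance, as used for the dissimilarity clustering."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def agglomerate(vectors, distance):
    """Average-linkage AHC. Returns merges as (members_a, members_b, distance)."""
    clusters = [[i] for i in range(len(vectors))]
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                d = sum(distance(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history


# Hypothetical spectra: indices 0 and 1 are aorta replicates, 2 and 3 lung.
spectra = [
    [100, 20, 5, 60],
    [95, 22, 6, 58],
    [30, 100, 40, 10],
    [28, 98, 45, 12],
]
history = agglomerate(spectra, lambda x, y: 1 - pearson(x, y))
for a, b, d in history:
    print(a, b, round(d, 3))
```

With these example vectors the replicates of each tissue merge first and the two tissue clusters merge last; passing `euclidean` instead of the correlation-based distance gives the dissimilarity variant.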
Interpretation of the tandem mass spectra of a single composition yields significant but incomplete structural information. The main reason is that a single composition generated by lyase III depolymerization comprises a complex mixture of positional isomers. Additionally, the complexity of the MS2 data is increased by losses of labile sulfate groups from the precursor and product ions during CAD fragmentation. The loss of sulfate from the precursor ions and product ions is dependent on the number of sulfate groups and the charge state of the ions involved. As the charge state of a precursor ion increases, the repulsion force between the sulfate groups increases, thus favoring glycosidic bond dissociation or cross-ring cleavages [13, 16, 21, 60]. In our experience, when the charge state is lower than the number of sulfate groups in the precursor ion, the tandem mass spectrum is dominated by loss of SO3.
Representative tandem mass spectra of the compounds [1,2,3,4,1]−3 (m/z 456.7) and [1,2,3,4,1]−4 (m/z 342.3) obtained from bovine aorta tissue are shown in Figure 2. Each HS composition is represented by the numerical code [A, B, C, D, E] corresponding to the number of each of the following units: [ΔHexA, HexA, GlcN, SO3, Ac]. The tandem mass spectrum of [1,2,3,4,1]−3 (m/z 456.7) (a) shows abundant product ions from loss of SO3 from the precursor and is therefore nearly devoid of structurally informative ions. In contrast, the tandem mass spectrum of [1,2,3,4,1]−4 (b) is rich in information and contains product ions that clearly identify the position of the acetate group. The product ions B1 and [M-0,2X] indicate that the first hexuronic acid at the non-reducing end does not carry a sulfate group. The complementary product ions Y2(1S,Ac)/B4(3S) and the product ion 0,2A6 (4S) unambiguously localize the acetate group to the glucosamine residue at the reducing end. The product ion at m/z 574.0792 illustrates another complexity of the tandem data: it could be assigned as either B3(1S) or 0,2A3(1S,Ac) with the same error (0.8 ppm). The ambiguity could be resolved using MS3 on this ion, which was not possible here due to low ion abundance. For either interpretation, the data indicate the presence of an isomer with acetate at the glucosamine in position 5, adjacent to the non-reducing end. The isomer with the acetate located at glucosamine 1 (reducing end) is dominant compared to the isomer with acetate at the glucosamine in position 5. The relative intensity of the product ions indicating the acetyl group on glucosamine 5 is less than 5%, while that of the product ions showing the acetate at glucosamine 1 ranges between 10% and 70%. A simple combinatorial calculation yields 378 possible isomers for the glycan composition [1,2,3,4,1]. 
The restriction posed by the predetermined location of the acetate group and by the lack of a sulfate group at the first hexuronic acid at the non-reducing end decreases this number to 70 possible combinations.
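The text does not spell out how the figures of 378 and 70 were obtained. One counting model that reproduces both numbers, offered here as an assumption rather than as the authors' derivation, places the 4 sulfate groups on 9 candidate sites for each of 3 possible acetate positions, and then fixes the acetate position and removes one site (the first hexuronic acid) under the restriction. The function name and parameters are hypothetical.

```python
from math import comb


def count_isomers(sulfate_sites, sulfates, acetate_positions):
    """Number of ways to place the sulfates, times the acetate positions."""
    return acetate_positions * comb(sulfate_sites, sulfates)


# Assumed model: 9 candidate sulfation sites, 4 sulfates, 3 acetate positions.
print(count_isomers(9, 4, 3))  # 378

# Acetate position fixed, and no sulfate on the first hexuronic acid:
# one acetate position and 8 remaining candidate sites.
print(count_isomers(8, 4, 1))  # 70
```

Under this model, fixing the acetate position removes the factor of 3 and excluding one sulfation site reduces the binomial term from C(9,4) = 126 to C(8,4) = 70.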
Conclusive determination of the location of the sulfate groups on the chain is further complicated by the fact that product ions may also be generated by sulfate losses from more highly sulfated precursors. This is indicated by the presence of the product ions 0,2A6(3S), [M-0,2X-S] and Y5 (3S) in the tandem mass spectrum of the compound [1,2,3,4,1]−4 at 100%, 20% and 35% relative abundance, respectively (Figure 2b). Loss of sulfate from the product ions becomes more pronounced as the size of the oligosaccharide and the number of sulfates increase, and modeling this mechanism is challenging. One way to counter the problem is to compare directly the pattern of fragmentation for an unknown precursor ion with that of a pure standard of known structure [13–16, 19, 22–26] and define the proportion of isomers present. Alternatively, the individual isomers in the mixture may be separated to allow further interpretation of the tandem mass spectrometry data; however, this is extremely difficult from a chromatography point of view and is only justified for target oligosaccharides with known high-value biological activities. It is therefore appropriate to develop an informatics approach to facilitate structure-function correlation from complex mixtures.
Developing methods for global characterization of the differences in isomeric structures of HS preparations for the purpose of biomarker discovery [61–63] or for analysis of therapeutic HS preparations  is a primary focus for the present work. Regarding biomarker discovery, it is important to consider the need to screen a large number of compounds in a single experiment. Additionally, characterization of the isomerization of GAGs is crucial in determining their biological function [65–69]. The goal here, using a statistical classification approach with AHC, is to establish a classification-map of tandem MS data that will clearly afford detailed isomeric determination and classification of the HS oligosaccharides dependent upon tissue type and precursor ion m/z. The map would thus yield an exact representation of the tandem mass spectrometric data in terms of similarity of the precursor ions and would thus represent the dataset of features in both the MS and tandem MS space.
We performed two exploratory avenues of data processing using AHC. The first involved similarity clustering based upon precursor m/z to determine if it was feasible to differentiate HS obtained from lung and aorta and to determine the degree of similarity or difference in isoforms of identical m/z precursor ions. The second approach involved application of AHC using dissimilarity measurements on the tandem mass spectra ion series for each unique cluster obtained in the similarity space. The goal was to identify the product ions different in abundances between tissue types and to determine fine structure underlying these differences.
Agglomerative hierarchical clustering was performed on the tandem mass spectra obtained from the 13 most prominent precursor ions from the lyase digest of HS obtained from bovine lung and aorta tissues. Results of similarity clustering of the 13 precursors are summarized in Figure 3 and demonstrate that all precursor ions clustered dependent upon the origin of the tissue and that precursors with the same m/z from lung and aorta clustered together. We observed considerable variability in the distance measurement in the clustering of the precursors, ranging from 0.99, indicating nearly identical tandem mass spectra, to 0.573, indicating the presence of either novel ions in the tandem mass spectra or differences in ion intensities originating from the different tissues. A complete summary of our observations on similarity, reflecting the degree of clustering of the precursors, is presented in more detail below and in Table 1.
The reproducibility of tandem mass spectra is mirrored by the degree of clustering. We observed that if the replicate analyses of each precursor ion were reproducible, the tandem mass spectra produced from the same tissue clustered together. For the 13 precursor ions studied, 10 demonstrated tight clustering indicating acceptable reproducibility: [1,2,3,3,1]−4 (m/z 322.3), [1,2,3,3,1]−3 (m/z 430.1), [1,2,3,4,1]−4 (m/z 342.3), [1,1,2,2,0]−2 (m/z 416.1), [1,1,2,2,0]−3 (m/z 277.3), [1,1,2,3,0]−3 (m/z 303.7), [1,1,2,2,1]−2 (m/z 437.1), [1,1,2,3,1]−3 (m/z 317.7). We observed that two precursor ions, [1,0,1,1,1]−2 (m/z 228.5) and [1,1,2,2,1]−3 (m/z 291.0), clustered well except for a single replicate. The ion intensity of this replicate in both cases was significantly lower than that of the other replicates. The tandem mass spectra of [1,1,2,2,1]−3 (m/z 291.0), shown in Figure 4, illustrate this observation. While the total ion current of aorta replicate 3 was on the order of 5E+4 counts, the other replicates were on the order of 1E+6 counts (Figure 4a). Furthermore, only aorta replicate 3 contains evidence of the precursor ion M (m/z 291.0).
We observed three precursor ions [1,2,3,4,1]−5 (m/z 273.6), [1,0,1,2,0]−3 (m/z 247.5), and [1,1,2,3,1]−4 (m/z 238.0) which did not cluster ideally. In general for these compounds, the reproducibility of the tandem mass spectra was poor due to a lack of sensitivity in the tandem mass spectra (see for example Supplemental Figure 3) and low signal-to-noise ratio. The fact that the most intense ions were not constant from one replicate to the other perturbs the clustering analyses. This perturbation was also observed as higher P values in the original ANOVA analyses of these tandem MS clusters. Poor reproducibility in tandem mass spectra ultimately is reflected in ANOVA with P values > 0.05. The lack of reproducibility of the [1,0,1,2,0]−3 (m/z 247.5) resulted from co-isolation of precursor from the compound [1,1,2,4,0]−4 (m/z 247.59). Thus AHC proved to be an effective measure to indicate the reproducibility or the lack thereof of the tandem mass spectrometry data.
AHC was performed across the precursor ions in order to determine which may be considered as good candidates for differentiation of HS obtained from aorta and lung tissues. For this type of clustering, distance measurement affords a relative-quantitative measure of the differences between the tandem mass spectra observed within this dataset and ranged from values of 0 to 1. A value of 1 indicates that the tandem mass spectra are nearly identical. In Table 1 the distance measurements are shown for the 10 precursor ions that demonstrated ideal clustering dependent upon their origin and yielded 3 distinct groups of clustering.
The first group contained 2 precursors, the component [1,1,2,3,0]−3 (m/z 303.7) and [1,1,2,2,1]−3 (m/z 291.0) which had a distance measurement equal to 0.99 indicating that the spectra were nearly identical. This is illustrated in Figure 4 which shows the tandem mass spectra of the precursor ion [1,1,2,2,1]−3 (m/z 291.0). Both components [1,1,2,2,1]−3 (m/z 291.0) and [1,1,2,3,0]−3 (m/z 303.7) (not shown) from both tissues were identical in terms of glycoform composition and were thus not good candidates to distinguish HS from the tissues.
The second group was composed of 4 precursors, the components [1,1,2,2,0]−2 (m/z 416.1), [1,1,2,2,0]−3 (m/z 277.3), [1,1,2,2,1]−2 (m/z 437.1) and [1,1,2,3,1]−3 (m/z 317.7). Their distance measurements in the similarity clustering ranged between 0.969 and 0.937, indicating that a small degree of difference existed between the tandem mass spectra obtained from lung and aorta tissues. Manual comparison of the tandem mass spectra of the compounds [1,1,2,2,0]−2 (m/z 416.1), [1,1,2,2,0]−3 (m/z 277.3) and [1,1,2,3,1]−3 (m/z 317.7) did not indicate noteworthy differences. Tandem mass spectra of component [1,1,2,2,1]−2 (m/z 437.1) indicated some noticeable differences in the product ion intensities between the aorta and lung tissues, including 0,2X2 (1S) (m/z 300.0), B3 (2S) (m/z 326.5) and Y3 (2S,Ac) (m/z 358.0). These are illustrated in Figure 5. We also observed the co-isolation of the compound [1,3,4,5,0]−4 (m/z 436.0386) (−2.4 ppm) in the tandem mass spectra from the lung tissue (Figure 5b). The relative intensity of this component was less than 5% of the intensity of the target compound and contributed minimally to the tandem mass spectra; therefore it did not contribute to the intensity of the product ion Y3 (2S,Ac) (m/z 358.0). The data for component [1,1,2,2,1]−2 (m/z 437.1) are consistent with the presence of different positional isomers in aorta vs. lung tissue. Of note is the fact that the product ion pattern for precursor ion [1,1,2,2,1]−3 (m/z 291.0) was similar for the lung and the aorta. Charge state influences product ion pattern, and it is therefore not surprising that the degree to which diagnostic product ions are observed depends on charge state; however, the charge state distribution differs depending on the tissue type, consistent with the conclusion that tissue-specific isomers influence charge state.
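The mass error quoted for the co-isolated species can be checked with a short calculation. In this sketch the theoretical m/z of [1,3,4,5,0] at charge 4 is an assumption computed here from standard monoisotopic masses for an [M − 4H]4− ion; it is not a value reported in the article, and the function name is hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Assumed theoretical m/z for [1,3,4,5,0] as [M - 4H]4-.
THEORETICAL_MZ = 436.03964
OBSERVED_MZ = 436.0386  # value quoted in the text

print(round(ppm_error(OBSERVED_MZ, THEORETICAL_MZ), 1))  # -2.4
```

Under this assumption the result agrees with the −2.4 ppm quoted for this assignment.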
The third group contained 4 precursor ions, [1,2,3,3,1]−4 (m/z 322.3), [1,2,3,3,1]−3 (m/z 430.1), [1,2,3,4,1]−4 (m/z 342.3) and [1,0,1,1,1]−2 (m/z 228.5). These were present with distance measurements ranging from 0.573 to 0.797. These values indicated greater differences than for the compounds previously described, because the differences arose in the most abundant product ions, resulting from significantly different isomer composition, rather than in the less abundant product ions. The tandem mass spectra of precursor ion [1,2,3,3,1]−4 (m/z 322.3) are presented in Figure 6 and indicate that the most intense product ion is Y5 (3S,Ac) (m/z 377.5) for the lung tissue and 0,2A6 (3S) (m/z 396.4) for the aorta tissue. The difference between the tandem MS data is due to intensity and not to the appearance of new product ions. Verification of the isolation window is important to ensure that the difference between tandem mass spectra from the 2 tissues is due to HS isomers specific to lung versus aorta and not to co-isolation of unrelated compounds. Figure 6 indicates that the precursor ion [1,2,3,3,1]−4 (m/z 322.3) is cleanly isolated for the aorta tissue. The isolation window for the lung tissue contains a co-isolated component, [0,2,2,3,1]−3 (m/z 323.6901), which accounted for less than 5% of the signal. The ions Y5(3S,Ac) (m/z 377.5) and 0,2A6(3S) (m/z 396.5) could not be assigned to any product ions of the structure [0,2,2,3,1]−3 (m/z 323.6901), consistent with the conclusion that the differences observed between the tandem mass spectra of the component [1,2,3,3,1]−4 (m/z 322.3) were significant and reflected the presence of different glycoforms in the two tissues. By contrast, the compound [1,1,2,3,1]−4 (m/z 228.5) clustered properly between lung and aorta tissue, but this observation is due to co-isolation of the compound [1,1,2,3,0]−4 (m/z 227.5) at different intensities in the lung (<30%) and the aorta (>70%) tissue.
The results summarized in Table 1 indicate that the isolation of the compositions [1,2,3,4,1]−4 (m/z 342.3) and [1,2,3,3,1]−3 (m/z 430.1) was clean. The ideal clustering of both compositions from aorta and lung is consistent with the presence of different positional isomers in the lung versus aorta tissues. We observed (data not shown) that the most significant differences in product ion intensity between the aorta and lung tissues were for 0,2A6 (3S) (m/z 396.4)//[M-S] (m/z 322.8) for the component [1,2,3,4,1]−4 (m/z 342.3) and B4 (2S) (m/z 407.0)//[M-S] (m/z 403.4) for the component [1,2,3,3,1]−3 (m/z 430.1). The observations are consistent with the existence of positional isoform differences between HS obtained from the lung and aorta tissues.
We hypothesized that the components clustered together primarily because they share a common structural core. Seven common product ions were clearly identified at different intensities between lung and aorta HS: B1 (m/z 157.0), Y5 (3S,Ac) (m/z 377.4), 0,2A6 (3S) (m/z 396.4), B4 (2S) (m/z 407.1), C4 (2S) (m/z 416.1), C5 (1S,Ac) (m/z 476.1) and 0,2A3 (1S,Ac) (m/z 574.1). The presence of 0,2A6 (3S) (m/z 396.4) and the product ion 0,2A3 (1S,Ac) (m/z 574.1) indicates that the common HS core contains the acetate group, which could be located for both of these compounds at the first and fifth glucosamine residues of the oligosaccharide. While this core structure containing the sulfate groups is plausible, the current study does not define it more precisely. To fully elucidate the structures of these compounds and characterize all possible compositional isoforms, including the positions of sulfates that may give rise to diagnostic ions, it will be necessary to isolate the individual components by some means of separation. One possible way to provide separation space with sufficient resolution to solve this challenging problem is ion mobility mass spectrometry, an area of research that has recently been applied to the characterization of HS.
AHC using dissimilarity measurements was performed on the tandem mass spectral ion series for each unique cluster obtained in the similarity space in order to determine which product ions were diagnostic of tissue type for each precursor m/z. The goals were twofold: to identify fine structure and feature differences in product ion mass spectra from near-neighbor clusters that would be indicative of structural homology, and to determine the major differences in product ions resulting from different distributions of positional isomers, which are of potential use as diagnostic marker ions. The distance measurement of AHC by dissimilarity is interpreted here as follows: a value of 1 indicates that the product ions differed significantly between tandem mass spectra of the same precursor ion, while a value of 0 indicates homology.
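As an illustration of how a dissimilarity-based AHC of product ions can be set up, the following minimal Python sketch clusters product ions by their relative abundance profiles across spectra. It is not the implementation used in this work: the distance metric (Euclidean distance scaled to the interval 0 to 1), the average-linkage rule, the function names and the abundance values are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist


def normalized_distances(profiles):
    """Pairwise Euclidean distances between abundance profiles, scaled to [0, 1]."""
    raw = {frozenset(pair): dist(profiles[pair[0]], profiles[pair[1]])
           for pair in combinations(profiles, 2)}
    # Fall back to 1.0 so that identical (or single) profiles do not divide by zero.
    largest = max(raw.values(), default=0.0) or 1.0
    return {pair: d / largest for pair, d in raw.items()}


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, height)."""
    d = normalized_distances(profiles)

    def height(pair):
        # Average-linkage height: mean distance over all cross-cluster ion pairs.
        a, b = pair
        return sum(d[frozenset((x, y))] for x in a for y in b) / (len(a) * len(b))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average dissimilarity.
        a, b = min(combinations(clusters, 2), key=height)
        merges.append((a, b, height((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Invented relative abundances in (lung 1, lung 2, aorta 1, aorta 2); not measured data.
profiles = {
    "[M-S]": (1.00, 0.98, 1.00, 0.97),
    "Y3(2S,Ac)": (0.30, 0.32, 0.12, 0.10),
    "B3(2S)": (0.28, 0.30, 0.14, 0.12),
    "B1": (0.05, 0.06, 0.05, 0.06),
}

for a, b, h in average_linkage(profiles):
    print(f"{a} + {b} merged at height {h:.3f}")
```

With these invented profiles, the two ions with similar abundance patterns merge first at a small height, while the most intense ion joins last at the greatest height.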
Example results of dissimilarity clustering are provided in Figure 7, which illustrates the clustering of the three general groups described above: (a) for precursor ion [1,1,2,2,1]−3 (m/z 291.0), no major differences in product ion abundances between lung and aorta tissue; (b) for precursor ion [1,1,2,2,1]−2 (m/z 437.1), slight differences in product ion abundances; and (c) for precursor ion [1,2,3,3,1]−4 (m/z 322.3), major differences in product ion abundances for HS oligosaccharides obtained from lung and aorta tissues. In Figure 7a, depicting the tandem MS clustering of the component [1,1,2,2,1]−3 (m/z 291.0), we observed no major differences between the groups, as shown by dissimilarity measurements of less than 0.05. The only feature that showed a difference, with a distance measurement equal to 1, was the most intense product ion in the tandem mass spectra, 0,2A4 (2S). We observed this perturbation in distance measurements consistently throughout the application of AHC whenever a product ion was the most intense in the tandem mass spectra.
In the second group, represented in Figure 7b depicting the tandem MS clustering of the component [1,1,2,2,1]−2 (m/z 437.1), we observed some differences between the aorta and lung groups, and distinct clusters of product ions were observed. The first included the most intense ion, [M-S] (m/z 397.1), with a distance measurement equal to 1. The second cluster included the features Y3 (2S,Ac) (m/z 358.0) and B3 (2S) (m/z 326.5) with a distance measurement equal to 0.13. The last cluster, with a distance measurement of < 0.13, comprised the product ions B3 (1S) (m/z 574.1), M (m/z 437.1), 0,2A4 (S,Ac) (m/z 346.7) and 0,2A4 (2S) (m/z 386.6). The features Y3 (2S,Ac) (m/z 358.0) and B3 (2S) (m/z 326.5) constituted the diagnostic ions and reflect the different isomeric glycoforms present in the aorta and lung tissues. The distance measurement of 0.13 indicates a moderate difference between the sets of tandem mass spectra.
In the third group, presented in Figure 7c depicting the tandem MS clustering of the component [1,2,3,3,1]−4 (m/z 322.3), major differences were observed between the aorta and lung groups. Here, two product ions, Y5 (3S,Ac) (m/z 377.5) and 0,2A6 (3S) (m/z 396.4), clustered together with a distance measurement equal to 1. Both features clearly indicated the presence of different positional isoforms in HS from lung and aorta tissues, and their variation in intensity between lung and aorta was significant. The second group also contained, as its first subcluster, C4 (2S) (m/z 416.2).
In the biological context, glycoconjugate glycans exist as mixtures of glycoforms and positional isomers. Thus, using assumptions relating to known activities of biosynthetic enzymes, the m/z value for a given HS oligosaccharide composition defines the range of positional isomers present in the mixture. As demonstrated here, the number of positional isomers may be constrained further using tandem mass spectrometric data.
For HS oligosaccharides, the number of positional isomers present for a given precursor ion composition is relatively large. The tandem mass spectrometric dissociation patterns for HS oligosaccharides are not entirely favorable to direct interpretation of structure, in that uninformative losses of sulfate occur. Nonetheless, the tandem mass spectral profiles serve to differentiate HS from different tissues based on differences in product ion abundances. These observations are consistent with the conclusion that such patterns arise from different distributions of positional isomers in the tissue samples. We demonstrate that the use of AHC to analyze these data has three advantages. First, it provides a means of quality control for the tandem mass spectral data set. Second, it identifies the precursor ions for which significant differences in product ion abundances occur between the samples. Third, it identifies the product ions that have the most value for diagnostic purposes.
The level of fine structural detail produced by this AHC-based data analysis method is determined ultimately by the tandem MS parameters. In the present work, static infusion tandem MS was used. The advantage of this method was that the precursor ion charge states were higher than those typically observed using on-line liquid chromatography MS. The disadvantage was that the lack of on-line separation contributed to the problem of co-isolation of different oligosaccharide compositions that produced similar m/z values. It is likely that the level of structural detail produced will increase through the use of LC/MS and/or activated electron dissociation methods.
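The co-isolation problem noted above can be screened for computationally. The sketch below computes the m/z of the deprotonated ion from a composition and lists candidate compositions that fall inside an isolation window. It assumes that the bracketed compositions denote [ΔHexA, HexA, GlcN, SO3, Ac], a reading under which the calculated values agree with the m/z values quoted in this work. The function names and the isolation half-width of 1.5 m/z units are hypothetical and do not describe the instrument settings used here.

```python
# Monoisotopic masses (Da) in the assumed order [dHexA, HexA, GlcN, SO3, Ac].
RESIDUE_MASSES = (
    158.02152,  # 4,5-unsaturated hexuronic acid residue, C6H6O5
    176.03209,  # hexuronic acid residue, C6H8O6
    161.06881,  # glucosamine residue, C6H11NO4
    79.95681,   # sulfate substituent, SO3
    42.01056,   # acetyl substituent, C2H2O
)
WATER = 18.01056   # one water completes the free oligosaccharide
PROTON = 1.00728


def mz(composition, charge):
    """m/z of the ion formed by removing `charge` protons from the neutral molecule."""
    neutral = WATER + sum(n * m for n, m in zip(composition, RESIDUE_MASSES))
    return (neutral - charge * PROTON) / charge


def co_isolated(target, candidates, half_width):
    """Candidates (composition, charge) whose m/z lies within target m/z +/- half_width."""
    centre = mz(*target)
    return [c for c in candidates
            if c != target and abs(mz(*c) - centre) <= half_width]


target = ((1, 2, 3, 3, 1), 4)
candidates = [((0, 2, 2, 3, 1), 3), ((1, 2, 3, 4, 1), 4)]

print(f"target m/z = {mz(*target):.4f}")
for composition, charge in co_isolated(target, candidates, half_width=1.5):
    print(composition, charge, f"{mz(composition, charge):.4f}")
```

For the precursor [1,2,3,3,1]−4, only [0,2,2,3,1]−3 falls inside this hypothetical window, about 1.4 m/z units away from the target.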
Even with the acknowledged limitations, the AHC method for analysis of tandem MS datasets shows great potential when applied to problems in glycomics. Complex positional isomer mixtures are typical of released glycoconjugate glycans. The AHC method enables rapid analysis of large sets of tandem MS data in a manner that permits discovery of precursor and product ions that reflect the unique set of structures present in a given sample, and thus the underlying biological functions giving rise to these structures. As such, the method has significant potential value to the glycomics community.
This work was funded by NIH Grants P41RR10888, R01HL098950 and S10RR020946.