Tandem MS on HS oligosaccharides
Interpretation of the tandem mass spectra of a single composition yields significant but incomplete structural information. The main reason is that a single composition generated by lyase III depolymerization yields a complex mixture of positional isomers. Additionally, the complexity of the MS2 data is increased by losses of labile sulfate groups from the precursor and product ions during CAD fragmentation. The loss of sulfate from precursor and product ions depends on the number of sulfate groups and the charge state of the ions involved. As the charge state of a precursor ion increases, the repulsion between the sulfate groups increases, thus favoring glycosidic bond dissociation or cross-ring cleavages [13]. In our experience, when the charge state is lower than the number of sulfate groups in the precursor ion, the tandem mass spectrum is dominated by loss of SO3.
Representative tandem mass spectra of the compounds [1,2,3,4,1]−3 (m/z 456.7) and [1,2,3,4,1]−4 (m/z 342.3) obtained from bovine aorta tissue are shown in Figure 2. Each HS composition is represented by the numerical code [A, B, C, D, E], giving the number of each of the following components: [ΔHexA, HexA, GlcN, SO3, Ac]. The tandem mass spectrum of [1,2,3,4,1]−3 (m/z 456.7) (Figure 2a) shows abundant product ions from loss of SO3 from the precursor and is therefore nearly devoid of structurally informative ions. In contrast, the tandem mass spectrum of [1,2,3,4,1]−4 (Figure 2b) is rich in information and contains product ions that clearly identify the position of the acetate group. The product ions B1 and [M-0,2X] indicate that the first hexuronic acid at the non-reducing end does not carry a sulfate group. The complementary product ions Y2(1S,Ac)/B4(3S) and the product ion 0,2A6(4S) unambiguously localize the acetate group to the glucosamine residue at the reducing end. The product ion at m/z 574.0792 illustrates another complexity of the tandem data: it could be assigned as B3(1S) or 0,2A3(1S,Ac), with the same mass error (0.8 ppm). The ambiguity could be resolved by MS3 on this ion, which was not possible here because of low ion abundance. For either interpretation, the data indicate the presence of an isomer with acetate at the glucosamine in position 5, adjacent to the non-reducing end. The isomer with the acetate located at glucosamine 1 (reducing end) is dominant compared to the isomer with acetate at the glucosamine in position 5: the relative intensity of the product ions indicating the acetyl group on glucosamine 5 is less than 5%, while that of the product ions showing acetate at glucosamine 1 ranges between 10% and 70%. A simple combinatorial calculation yields 378 possible isomers for the glycan composition [1,2,3,4,1]. The restriction posed by the predetermined location of the acetate group and by the lack of a sulfate group at the first hexuronic acid at the non-reducing end decreases this number to 70 possible combinations.
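The counting behind these two numbers is not spelled out in the text. As a minimal sketch, one set of assumptions that reproduces both values is three candidate glucosamine residues for the acetate group and nine candidate sulfation sites sharing four sulfate groups. The site counts and the function name `count_isomers` are illustrative assumptions, not taken from the source.

```python
from math import comb

def count_isomers(acetyl_positions, sulfation_sites, n_sulfates):
    """Number of ways to place one acetate and n_sulfates sulfates.

    Assumes the acetate position and the sulfate positions are chosen
    independently (an illustrative counting model, not the authors' own).
    """
    return acetyl_positions * comb(sulfation_sites, n_sulfates)

# Assumed model: 3 glucosamines can carry the acetate, 9 sites can carry sulfate.
total_isomers = count_isomers(acetyl_positions=3, sulfation_sites=9, n_sulfates=4)

# Acetate position fixed (1 choice) and the site on the non-reducing-end
# hexuronic acid excluded (8 sites remain).
restricted_isomers = count_isomers(acetyl_positions=1, sulfation_sites=8, n_sulfates=4)

print(total_isomers, restricted_isomers)  # 378 70
```

Under this model, fixing the acetate position removes the factor of three, and excluding one sulfation site reduces the binomial term from C(9,4) = 126 to C(8,4) = 70.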
Figure 2 Interpretation of tandem mass spectra of HS components: (a) [1,2,3,4,1]−3 (m/z 456.7) and (b) [1,2,3,4,1]−4 (m/z 342.3) precursors obtained from bovine aorta tissue. Influence of the charge state and number of sulfates on the fragmentation pattern.
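The numerical composition code can be related to the quoted precursor m/z values with a short calculation. The sketch below assumes standard monoisotopic atomic masses and conventional residue formulas for each slot of the code, and assumes that charge arises from proton loss; none of these values are given in the source, and `precursor_mz` is a hypothetical helper name.

```python
# Monoisotopic atomic masses (standard values, not taken from the source).
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.99491462, "S": 31.972071}
PROTON = 1.00727647

# Assumed residue formulas for the slots [dHexA, HexA, GlcN, SO3, Ac].
RESIDUES = [
    {"C": 6, "H": 6, "O": 5},            # unsaturated hexuronic acid residue
    {"C": 6, "H": 8, "O": 6},            # hexuronic acid residue
    {"C": 6, "H": 11, "N": 1, "O": 4},   # glucosamine residue
    {"S": 1, "O": 3},                    # sulfate, counted as SO3
    {"C": 2, "H": 2, "O": 1},            # acetyl, net addition of C2H2O
]
WATER = {"H": 2, "O": 1}

def formula_mass(formula):
    """Monoisotopic mass of an elemental formula given as {element: count}."""
    return sum(ATOM[element] * count for element, count in formula.items())

def precursor_mz(composition, charge):
    """m/z of the [M - zH]z- ion for a composition code [A, B, C, D, E]."""
    neutral = formula_mass(WATER) + sum(
        count * formula_mass(formula) for count, formula in zip(composition, RESIDUES)
    )
    return (neutral - charge * PROTON) / charge

print(round(precursor_mz([1, 2, 3, 4, 1], 3), 1))  # 456.7
print(round(precursor_mz([1, 2, 3, 4, 1], 4), 1))  # 342.3
```

With these assumptions the two precursors in Figure 2 come out at the m/z values quoted in the text, which is a useful consistency check on a composition assignment.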
Conclusive determination of the location of the sulfate groups on the chain is further complicated by the fact that product ions may also be generated by sulfate losses from more highly sulfated precursors. This is indicated by the presence of the product ions [M-0,2X-S] and Y5(3S) in the tandem mass spectrum of the compound [1,2,3,4,1]−4 at 100%, 20% and 35% relative abundances (Figure 2b). Loss of sulfate from the product ions becomes more pronounced as the size of the oligosaccharide and the number of sulfates increase, and modeling this mechanism is challenging. One way to counter the problem is to compare the fragmentation pattern of an unknown precursor ion directly with that of a pure standard of known structure [13] and define the proportion of isomers present. Alternatively, the individual isomers in the mixture may be separated to allow further interpretation of the tandem mass spectrometry data; however, this is extremely difficult chromatographically and is justified only for target oligosaccharides with known high-value biological activities. It is therefore appropriate to develop an informatics approach to facilitate structure-function correlation from complex mixtures.
Agglomerative Hierarchical Clustering
Developing methods for global characterization of the differences in isomeric structures of HS preparations for the purpose of biomarker discovery [61] or for analysis of therapeutic HS preparations [64] is a primary focus of the present work. Regarding biomarker discovery, it is important to consider the need to screen a large number of compounds in a single experiment. Additionally, characterization of the isomerization of GAGs is crucial in determining their biological function [65]. The goal here, using a statistical classification approach with AHC, is to establish a classification map of tandem MS data that affords detailed isomeric determination and classification of the HS oligosaccharides according to tissue type and precursor ion m/z. The map would thus represent the tandem mass spectrometric data in terms of similarity of the precursor ions, covering the features in both the MS and tandem MS space.
We explored two avenues of data processing using AHC. The first involved similarity clustering based upon precursor m/z to determine whether it was feasible to differentiate HS obtained from lung and aorta and to determine the degree of similarity or difference in isoforms of precursor ions of identical m/z. The second involved application of AHC using dissimilarity measurements on the tandem mass spectra ion series for each unique cluster obtained in the similarity space. The goal was to identify the product ions that differ in abundance between tissue types and to determine the fine structure underlying these differences.
Agglomerative hierarchical clustering was performed on the tandem mass spectra of the 13 most prominent precursor ions obtained from the lyase digest of HS from bovine lung and aorta tissues. Results of similarity clustering of the 13 precursors are summarized in Figure 3 and demonstrate that all precursor ions clustered according to the tissue of origin and that precursors with the same m/z from lung and aorta clustered together. The distance measurement varied considerably among the precursors, ranging from 0.99, indicating nearly identical tandem mass spectra, to 0.573, indicating either the presence of novel ions in the tandem mass spectra or differences in ion intensities between the tissues. Our observations on similarity, reflecting the degree of clustering of the precursors, are presented in more detail below and summarized in the accompanying table.
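The similarity clustering described above can be sketched in a few lines. The text does not state which similarity metric or linkage rule was used, so the example below assumes cosine similarity between binned intensity vectors and average linkage. The four toy spectra and the function names are hypothetical.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine similarity of two intensity vectors (1.0 means identical shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def average_linkage(spectra):
    """Agglomerate spectra bottom-up; return merges as (left, right, similarity).

    spectra maps a label to an intensity vector on shared m/z bins. At each
    step the two clusters with the highest mean pairwise similarity merge.
    """
    clusters = [(label,) for label in spectra]
    merges = []
    while len(clusters) > 1:
        best = None
        for left, right in combinations(clusters, 2):
            sims = [cosine_similarity(spectra[i], spectra[j]) for i in left for j in right]
            score = sum(sims) / len(sims)
            if best is None or score > best[0]:
                best = (score, left, right)
        score, left, right = best
        clusters = [c for c in clusters if c not in (left, right)] + [left + right]
        merges.append((left, right, round(score, 3)))
    return merges

# Hypothetical replicate spectra: A = aorta, L = lung, four shared m/z bins.
toy_spectra = {
    "A1": [100, 20, 5, 40],
    "A2": [95, 22, 6, 42],
    "L1": [30, 100, 10, 45],
    "L2": [28, 97, 12, 48],
}

merges = average_linkage(toy_spectra)
for left, right, similarity in merges:
    print(left, right, similarity)
```

In this toy case the replicates of each tissue merge first at a similarity close to 1, and the two tissue clusters join last at a much lower value, which is the kind of pattern a dendrogram such as Figure 3 displays.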
Figure 3 Agglomerative hierarchical clustering of heparan sulfate tandem mass spectra from aorta (A) and lung (L). The dendrogram illustrates by clustering (X axis) and distance measurement (Y axis) the degree of similarity of the tandem mass spectra originated ...
Overview of clustering for 13 precursor ions common to aorta and lung tissue in terms of and differentiation of glycoforms. DP = degree of polymerization.
AHC confirms the quality of the tandem mass spectra
The reproducibility of tandem mass spectra is mirrored by the degree of clustering. When the replicate analyses of a precursor ion were reproducible, the tandem mass spectra produced from the same tissue clustered together. Of the 13 precursor ions studied, 10 demonstrated tight clustering indicating acceptable reproducibility. Eight clustered tightly across all replicates: [1,2,3,3,1]−4 (m/z 322.3), [1,2,3,3,1]−3 (m/z 430.1), [1,2,3,4,1]−4 (m/z 342.3), [1,1,2,2,0]−2 (m/z 416.1), [1,1,2,2,0]−3 (m/z 277.3), [1,1,2,3,0]−3 (m/z 303.7), [1,1,2,2,1]−2 (m/z 437.1) and [1,1,2,3,1]−3 (m/z 317.7). The remaining two precursor ions, [1,0,1,1,1]−2 (m/z 228.5) and [1,1,2,2,1]−3 (m/z 291.0), clustered well except for a single replicate. The ion intensity of this replicate in both cases was significantly lower than that of the other replicates. The tandem mass spectra of [1,1,2,2,1]−3 (m/z 291.0), shown in Figure 4, illustrate this observation. While the total ion current of aorta replicate 3 was on the order of 5E+4 counts, that of the other replicates was on the order of 1E+6 counts (Figure 4a). Furthermore, only aorta replicate 3 contains evidence of the precursor ion M (m/z 291.0).
Figure 4 Tandem mass spectra of [1,1,2,2,1]−3 (m/z 291.0) obtained from bovine (a) aorta and (b) lung tissues. Tandem MS data does not allow for differentiation of glycoforms specific to the tissue type. The product ions that differ in abundance significantly ...
We observed three precursor ions, [1,2,3,4,1]−5, [1,1,2,3,1]−4 and [1,0,1,2,0]−3, which did not cluster ideally. In general for these compounds, the reproducibility of the tandem mass spectra was poor due to a lack of sensitivity (see for example Supplemental Figure 3) and low signal-to-noise ratio. The fact that the most intense ions were not constant from one replicate to the other perturbs the clustering analyses. This perturbation was also observed as higher P values in the original ANOVA analyses of these tandem MS clusters: poor reproducibility in tandem mass spectra is ultimately reflected in ANOVA P values > 0.05. The lack of reproducibility of [1,0,1,2,0]−3 resulted from co-isolation of precursor from the compound [1,1,2,4,0]−4. Thus AHC proved to be an effective indicator of the reproducibility, or lack thereof, of the tandem mass spectrometry data.
AHC across precursor ions quantifies the degree of similarity of the tandem mass spectrometric data
AHC was performed across the precursor ions in order to determine which may be considered good candidates for differentiation of HS obtained from aorta and lung tissues. For this type of clustering, the distance measurement affords a relative quantitative measure of the differences between the tandem mass spectra within this dataset and ranges from 0 to 1. A value of 1 indicates that the tandem mass spectra are nearly identical. The distance measurements are shown in the accompanying table for the 10 precursor ions that demonstrated ideal clustering according to their origin; these fell into 3 distinct groups.
The first group contained 2 precursors, the components [1,1,2,3,0]−3 (m/z 303.7) and [1,1,2,2,1]−3 (m/z 291.0), which had a distance measurement of 0.99, indicating that the spectra were nearly identical. This is illustrated in Figure 4, which shows the tandem mass spectra of the precursor ion [1,1,2,2,1]−3 (m/z 291.0). Both components, [1,1,2,2,1]−3 (m/z 291.0) and [1,1,2,3,0]−3 (m/z 303.7) (not shown), were identical in terms of glycoform composition in the two tissues and were thus not good candidates to distinguish HS from the tissues.
The second group was composed of 4 precursors, the components [1,1,2,2,0]−2 (m/z 416.1), [1,1,2,2,0]−3 (m/z 277.3), [1,1,2,2,1]−2 (m/z 437.1) and [1,1,2,3,1]−3 (m/z 317.7). Their distance measurements in the similarity clustering ranged between 0.937 and 0.969, indicating that a small degree of difference existed between the tandem mass spectra obtained from lung and aorta tissues. Manual comparison of the tandem mass spectra of the compounds [1,1,2,2,0]−2 (m/z 416.1), [1,1,2,2,0]−3 (m/z 277.3) and [1,1,2,3,1]−3 (m/z 317.7) did not indicate noteworthy differences. Tandem mass spectra of component [1,1,2,2,1]−2 (m/z 437.1) showed noticeable differences in product ion intensities between the aorta and lung tissues, including 0,2X2 (1S) (m/z 300.0), B3 (2S) (m/z 326.5) and Y3 (2S,Ac) (m/z 358.0). These are illustrated in Figure 5. We also observed the co-isolation of the compound [1,3,4,5,0]−4 (m/z 436.0386) (−2.4 ppm) in the tandem mass spectra from the lung tissue (Figure 5b). The relative intensity of this component was less than 5% of the intensity of the target compound and contributed minimally to the tandem mass spectra; it therefore did not contribute to the intensity of the product ion Y3 (2S,Ac) (m/z 358.0). The data for the component [1,1,2,2,1]−2 (m/z 437.1) are consistent with the presence of different positional isomers in aorta vs. lung tissue. Of note is the fact that the product ion pattern for precursor ion [1,1,2,2,1]−3 (m/z 291.0) was similar for the lung and the aorta. Charge state influences product ion pattern, and it is therefore not surprising that the degree to which diagnostic product ions are observed depends on charge state; however, the charge state distribution differs depending on the tissue type, consistent with the conclusion that tissue-specific isomers influence charge state.
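Assignments such as the co-isolated [1,3,4,5,0]−4 ion rest on the mass error between measured and theoretical m/z, expressed in parts per million. The sketch below shows that calculation; the theoretical value is an assumption computed from standard monoisotopic masses and is not given in the source, and `ppm_error` is a hypothetical helper name.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

# Measured value quoted in the text for the co-isolated [1,3,4,5,0]-4 ion.
measured = 436.0386

# Assumed theoretical m/z for [M - 4H]4-, computed from standard
# monoisotopic masses (this value does not appear in the source).
theoretical = 436.03964

error = ppm_error(measured, theoretical)
print(f"{error:.1f} ppm")
```

Under this assumption the error comes out close to the −2.4 ppm quoted in the text. The same function applies to product ions, where two candidate assignments with equal ppm error cannot be distinguished by mass alone.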
Figure 5 Tandem mass spectra of [1,1,2,2,1]−2 (m/z 437.1) obtained from bovine (a) aorta and (b) lung tissues. Tandem MS data allowed for differentiation of glycoforms specific to the tissue type. The product ions that differ in abundance significantly between ...
The third group contained 4 precursor ions, [1,2,3,3,1]−4 (m/z 322.3), [1,2,3,3,1]−3 (m/z 430.1), [1,2,3,4,1]−4 (m/z 342.3) and [1,0,1,1,1]−2 (m/z 228.5). Their distance measurements ranged from 0.573 to 0.797. These values were lower than those of the compounds previously described, indicating larger differences, because they arose from differences in the most abundant product ions, resulting from significantly different isomer composition, rather than from minor differences in the less abundant product ions. The tandem mass spectra of precursor ion [1,2,3,3,1]−4 (m/z 322.3) are presented in Figure 6 and indicate that the most intense product ion is Y5 (3S,Ac) (m/z 377.5) for the lung tissue and 0,2A6 (3S) (m/z 396.4) for the aorta tissue. The difference between the tandem MS data is due to intensity and not to the appearance of new product ions. Verifying the isolation window is important to ensure that the difference between tandem mass spectra from the 2 tissues is due to HS isomers specific to lung versus aorta and not to co-isolation of unrelated compounds. The precursor ion [1,2,3,3,1]−4 (m/z 322.3) was cleanly isolated for the aorta tissue. The isolation window for the lung tissue contained a co-isolated component, [0,2,2,3,1]−3 (m/z 323.6901), which accounted for less than 5% of the signal. The ions Y5 (3S,Ac) (m/z 377.5) and 0,2A6 (3S) (m/z 396.4) could not be assigned to any product ions of the structure [0,2,2,3,1]−3 (m/z 323.6901), consistent with the conclusion that the differences observed between the tandem mass spectra of the component [1,2,3,3,1]−4 (m/z 322.3) were significant and reflected the presence of different glycoforms in the two tissues. By contrast, the compound [1,0,1,1,1]−2 (m/z 228.5) clustered properly between lung and aorta tissue, but this observation is due to co-isolation of the compound [1,1,2,3,0]−4 (m/z 227.5) at different intensity in the lung (<30%) and the aorta (>70%) tissue.
Figure 6 Tandem mass spectra of [1,2,3,3,1]−4 (m/z 322.3) obtained from bovine (a) aorta and (b) lung. Tandem MS data allowed for differentiation of glycoforms specific to the tissue type. The product ions that differ in abundance significantly between aorta ...
The results summarized in the accompanying table indicate that the isolation of the compositions [1,2,3,4,1]−4 (m/z 342.3) and [1,2,3,3,1]−3 (m/z 430.1) was perfectly clean. This ideal clustering of both compositions from aorta and lung is consistent with the presence of different positional isomers in the lung versus aorta tissues. We observed (data not shown) that the most significant differences in product ion intensity between the aorta and lung tissues were for 0,2A6 (3S) (m/z 396.4) and [M-S] (m/z 322.8) for the component [1,2,3,4,1]−4 (m/z 342.3), and for B4 (2S) (m/z 407.0) and [M-S] (m/z 403.4) for the component [1,2,3,3,1]−3 (m/z 430.1). These observations are consistent with the existence of positional isoform differences between HS obtained from the lung and aorta tissues.
We hypothesized that the primary reason the components clustered together was a common structural core. Seven common product ions were clearly identified at different intensities between lung and aorta HS: B1 ... (m/z 574.1). The presence of 0,2A6 (3S) (m/z 396.4) and the product ion 0,2A3 (1S,Ac) (m/z 574.1) clearly indicates that the common HS core contains the acetate group, which could be identified for both those compounds at the first and fifth glucosamine residue of the oligosaccharide. While this core structure containing the sulfate compounds is plausible, our current study does not define it more precisely. To fully elucidate the structures of these compounds and characterize all possible compositional isoforms, including the positions of sulfates that may give rise to diagnostic ions, it will be necessary to isolate the individual components by some means of separation. One possible solution that would afford sufficient resolution to solve this challenging problem is ion mobility mass spectrometry, an area of research that has recently been applied to the characterization of HS [10].
AHC by dissimilarity across product ions allows defining of structural information
AHC using dissimilarity measurements was performed on the tandem mass spectra ion series for each unique cluster obtained in the similarity space in order to determine which product ions were diagnostic for tissue type for each precursor m/z. The goals were to determine fine structure in homology and feature differences in product ion mass spectra from near neighbor clusters that would be indicative of structural homology and to determine the major differences in product ions resulting from different distributions of positional isomers of potential use as diagnostic marker ions. The distance measurement of AHC by dissimilarity is represented in this case as follows: a value of 1 indicates that the product ions differed significantly between tandem mass spectra of the same precursor ions while a value of 0 indicates homology.
Example results of dissimilarity clustering are provided in Figure 7, which illustrates the clustering of the 3 general groups described above: (a) for precursor ion [1,1,2,2,1]−3 (m/z 291.0), no major differences in product ion abundances between lung and aorta tissue; (b) for precursor ion [1,1,2,2,1]−2 (m/z 437.1), slight differences in product ion abundances; and (c) for precursor ion [1,2,3,3,1]−4 (m/z 322.3), major differences in product ion abundances for HS oligosaccharides obtained from lung and aorta tissues. In Figure 7a, depicting the tandem MS clustering of the component [1,1,2,2,1]−3 (m/z 291.0), we observed no major differences between the groups, as reflected by dissimilarity measurements of less than 0.05. The only feature that showed a difference, with a distance measurement equal to 1, was the most intense product ion in the tandem mass spectra, 0,2A4 (2S). We observed this perturbation in distance measurements consistently throughout the application of AHC when the product ion was the most intense in the tandem mass spectra.
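One way the most intense product ion can end up at a distance of 1 is sketched below. The text does not state the dissimilarity metric, so the example assumes Euclidean distance between unscaled relative-abundance profiles, normalized to the largest pairwise distance. The ion names and abundances are hypothetical, and this is only one possible explanation of the observed perturbation.

```python
import math
from itertools import combinations

# Hypothetical relative abundances (%) per product ion.
# Columns: aorta replicates 1-3, then lung replicates 1-3.
profiles = {
    "base_peak": [100, 100, 100, 100, 100, 100],
    "ion_a": [30, 32, 29, 12, 10, 11],
    "ion_b": [8, 9, 7, 25, 27, 26],
    "ion_c": [5, 6, 5, 5, 4, 6],
    "ion_d": [3, 4, 3, 4, 3, 3],
}

def normalized_distances(profiles):
    """Pairwise Euclidean distances between ion profiles, scaled to [0, 1]."""
    raw = {
        frozenset((a, b)): math.dist(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }
    largest = max(raw.values())
    return {pair: dist / largest for pair, dist in raw.items()}

def nearest_neighbor_distance(distances, ion):
    """Distance from one ion to its closest neighbor among the other ions."""
    return min(dist for pair, dist in distances.items() if ion in pair)

distances = normalized_distances(profiles)
isolation = {ion: nearest_neighbor_distance(distances, ion) for ion in profiles}
for ion, value in sorted(isolation.items(), key=lambda item: -item[1]):
    print(f"{ion}: {value:.2f}")
```

With unscaled abundances the base peak is far from every other ion regardless of tissue, so an agglomerative procedure fed these distances would attach it to the tree last. Scaling each ion profile before computing distances would be one way to test whether the effect is an artifact of intensity rather than of tissue-specific differences.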
Figure 7 Agglomerative hierarchical clustering of product ions obtained from the components (a) [1,1,2,2,1]−3 (m/z 291.0), (b) [1,1,2,2,1]−2 (m/z 437.1) and (c) [1,2,3,3,1]−4 (m/z 322.3) obtained from bovine aorta and lung tissues. The product ...
In the second group, represented in Figure 7b, which depicts the tandem MS clustering of the component [1,1,2,2,1]−2 (m/z 437.1), we observed some differences between the aorta and lung groups. Different clusters of product ions were observed. The first included the most intense ion, [M-S] (m/z 397.1), with a distance measurement equal to 1. The second cluster included the features Y3 (2S,Ac) (m/z 358.0) and B3 (2S) (m/z 326.5), with a distance measurement equal to 0.13. The last cluster, with a distance measurement of < 0.13, is represented by the group of product ions B3 (1S) (m/z 574.1), M (m/z 437.1), 0,2A4 (S,Ac) (m/z 346.7) and 0,2A4 (2S) (m/z 386.6). The features Y3 (2S,Ac) (m/z 358.0) and B3 (2S) (m/z 326.5) constituted the diagnostic ions and reflect the different isomeric glycoforms present in the aorta and lung tissues. The distance measurement of 0.13 indicates a relatively moderate difference between the sets of tandem mass spectra.
In the third group, presented in Figure 7c, which depicts the tandem MS clustering of the component [1,2,3,3,1]−4 (m/z 322.3), major differences were observed between the aorta and lung groups. Here, two product ions, Y5 (3S,Ac) (m/z 377.5) and 0,2A6 (3S) (m/z 396.4), clustered together with a distance measurement equal to 1. Both features clearly indicated the presence of different positional isoforms in HS from lung and aorta tissues, and their variation in intensity between lung and aorta was significant. The second group also contained the first sub-cluster, C4 (2S) (m/z 416.2).