To build a single representation of the glycome, the features contributing most to the variation between structures had to be defined. An evaluation of all human glycan structures from GlycomeDB (2), in the context of known glycosylation pathways, revealed that N- and O-linked structures could be categorized using just three criteria: (i) the type and shape of the core structure, (ii) the nature and length of any chain and (iii) the nature of any terminal epitopes (e.g. sialylation, A or B antigen). The relationships between these criteria were also captured.
To summarize a set of glycan structures, these criteria are applied systematically. Each input structure is traversed from the reducing terminus to the non-reducing termini, and each of the criteria above is evaluated against every residue. A decision is then made to display, annotate or ‘compress’ each residue. Statistics describing the number of structures that have particular features (e.g. chain types or terminal epitopes) are calculated. Structures that appear incomplete or erroneous, i.e. that are inconsistent with the criteria, are removed from the set. The resulting high-confidence set of structures is used to build a composite structure, formed as the union of those structures. Separate composite structures are built for N- and O-linked sugars. To visualize these composite structures, a modified CFG schema is used to show the criteria of core shape, chain nature and length, and terminal epitopes. Annotations representing the statistics are also built into the graph. Histograms quantifying branching are shown alongside, together with the names of any branch types.
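The composite-building step can be illustrated with a minimal sketch. It is not the GlycoViewer implementation: the `Residue` and `CompositeNode` classes, the linkage strings and the `summarize` function are hypothetical names introduced here, sibling residues are assumed to be distinguishable by residue name and linkage, and the check on the reducing-end residue is only a simplified stand-in for the removal of inconsistent structures. The sketch merges structures into a union tree while counting how many structures contain each residue and each terminal residue type.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Residue:
    """One monosaccharide; `link` is its linkage to the parent residue."""
    name: str
    link: str = ""
    children: list = field(default_factory=list)


@dataclass
class CompositeNode:
    """Node of the union tree; `count` is how many structures contain it."""
    name: str
    link: str
    count: int = 0
    children: dict = field(default_factory=dict)


def merge(node, residue):
    """Add one structure to the composite, walking away from the reducing end."""
    node.count += 1
    for child in residue.children:
        # Assumption: siblings are uniquely identified by (name, linkage).
        key = (child.name, child.link)
        if key not in node.children:
            node.children[key] = CompositeNode(child.name, child.link)
        merge(node.children[key], child)


def terminal_names(residue):
    """Names of the residues at the non-reducing termini of one structure."""
    if not residue.children:
        return {residue.name}
    names = set()
    for child in residue.children:
        names |= terminal_names(child)
    return names


def summarize(structures, root_name):
    """Return (composite, terminal_counts) for structures rooted at root_name."""
    composite = CompositeNode(root_name, "")
    terminal_counts = Counter()
    for structure in structures:
        # Simplified stand-in for the consistency filter: skip wrong roots.
        if structure.name != root_name:
            continue
        merge(composite, structure)
        # A set is used so each terminal type counts once per structure.
        terminal_counts.update(terminal_names(structure))
    return composite, terminal_counts


def n_core(*arms):
    """Hypothetical helper: an N-linked core carrying the given mannose arms."""
    return Residue("GlcNAc", "", [
        Residue("GlcNAc", "b1-4", [Residue("Man", "b1-4", list(arms))])
    ])


# Two illustrative structures: a bare core and a core with one sialylated arm.
s1 = n_core(Residue("Man", "a1-3"), Residue("Man", "a1-6"))
s2 = n_core(
    Residue("Man", "a1-3", [
        Residue("GlcNAc", "b1-2", [
            Residue("Gal", "b1-4", [Residue("NeuAc", "a2-3")])
        ])
    ]),
    Residue("Man", "a1-6"),
)

composite, terminal_counts = summarize([s1, s2], root_name="GlcNAc")
```

In this sketch the per-node counts play the role of the annotations attached to the composite graph, and `terminal_counts` corresponds to the per-feature statistics; the decision to display, annotate or compress a residue is not modelled.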
The summarizing process has been built into the GlycoViewer tool (http://www.systemsbiology.org.au/glycoviewer). Lists containing up to hundreds of structures can be submitted, for example from databases such as GlycomeDB or GlycoSuiteDB (2). Structures must adhere to IUPAC nomenclature. Alternatively, a structure builder is supplied so that lists can be constructed as required and then analysed. The tool is freely available and has no login requirement. Detailed instructions on interpreting the tool’s output are given on the web site, on the page titled ‘Interpreting the Output’, and are also provided here as Supplementary Data.