Hepatocellular carcinoma (HCC) represents an increasing health problem in the United States. Serum α-fetoprotein, the currently used clinical marker, is elevated in only ~60% of HCC patients; therefore, the identification of additional markers is expected to have significant public health impact. The objective of our study was to quantitatively assess N-glycans originating from serum glycoproteins as alternative markers for the detection of HCC.
We used matrix-assisted laser desorption/ionization time-of-flight mass spectrometry for quantitative comparison of 83 N-glycans in serum samples of 202 participants (73 HCC cases, 77 age- and gender-matched cancer-free controls, and 52 patients with chronic liver disease). N-glycans were enzymatically released from serum glycoproteins and permethylated before mass spectrometric quantification.
The abundance of 57 N-glycans was significantly altered in HCC patients compared with controls. The sensitivity of six individual glycans evaluated for separation of HCC cases from population controls ranged from 73% to 90%, and the specificity ranged from 36% to 91%. A combination of three selected N-glycans was sufficient to classify HCC with 90% sensitivity and 89% specificity in an independent validation set of patients with chronic liver disease. The three N-glycans remained associated with HCC after adjustment for chronic viral infection and other known covariates, whereas the other glycans increased significantly at earlier stages of the progression of chronic viral infection to HCC.
A set of three identified N-glycans is sufficient for the detection of HCC with 90% prediction accuracy in a population with high rates of hepatitis C viral infection. Further evaluation of a wider clinical utility of these candidate markers is warranted.
Hepatocellular carcinoma (HCC) is a major worldwide health problem and a cancer with increasing incidence in the United States (1, 2). The increasing incidence of HCC in the United States has been associated with hepatitis C viral (HCV) infection, and a further increase in HCC is predicted to occur over the next few decades (3). In the Egyptian population, up to 90% of HCC cases were attributed to HCV infection (4). Approximately 14% of the population in Egypt is infected with HCV and 7 million people are believed to suffer from a chronic liver disease (CLD; ref. 5). HCC is third in incidence among the cancer diseases in men, with >8,000 new cases predicted by 2012 in this population (6). Studies of HCV progression to HCC in Egypt are expected to provide new insights into the management of this increasingly significant health problem (7).
Chronic hepatitis develops in ~80% of those infected with HCV. Over the course of 20 years or more, 10% to 30% of HCV carriers develop cirrhosis; patients with cirrhosis have an annual risk of 1% to 2% for developing HCC (8) and their prognosis is generally poor. The currently available systemic therapies show only a modest response rate and have not been shown to improve survival in patients with HCC (9). A complete surgical resection and liver transplant are at present the only curative treatment options (10). However, the majority of patients present with advanced unresectable disease not amenable to definitive local therapies (11). The slow development and the late detection of HCC suggest that the identification of biomarkers of disease progression and early detection are attractive research strategies.
The current diagnosis of HCC relies on clinical information, liver imaging, and measurement of serum α-fetoprotein (AFP). The reported sensitivity (41–65%) and specificity (80–94%) of AFP are not effective for early diagnosis due to the high proportion of false negatives (12). The identification of effective markers for the early detection of HCC is an active area of research with several new marker candidates reported within the last few years (13, 14). It has been pointed out that many currently used cancer biomarkers, including AFP, are glycoproteins (15). Fucosylated AFP was introduced as a marker of HCC with improved specificity (16, 17), whereas other glycoproteins, including GP73, are currently under evaluation as markers of HCC (18, 19). The analysis of protein glycosylation seems particularly relevant to liver pathology because of the major influence of this organ on the homeostasis of blood glycoproteins (20, 21).
An alternative strategy to the analysis of glycoproteins is the analysis of protein-associated glycans (22, 23). The characterization of glycans in serum of patients with liver disease is a promising strategy for biomarker discovery (24). Current methods allow quantitative comparison of ~80 different permethylated N-glycan structures by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS; ref. 25). This is a rich source of information for molecular characterization of the disease process. In this study, we describe MS analysis of glycans in HCC in an Egyptian population (4, 26). We analyzed N-glycans, enzymatically released from serum glycoproteins, as candidate markers for the detection of HCC. The results show that the quantification of N-glycans has the potential to improve diagnosis of HCC.
Red-top Vacutainer blood collection tubes were obtained from Becton Dickinson. The peptide-N-glycosidase F (PNGaseF; EC 3.5.1.52), isolated from Escherichia coli, was purchased from Associates of Cape Cod, Inc. Trifluoroethanol, 2,5-dihydroxy-benzoic acid (DHB), and sodium hydroxide were received from Aldrich. Chloroform, iodomethane, and sodium chloride were from EM Science. DTT and iodoacetamide were the products of Bio-Rad Laboratories. Ammonium bicarbonate was from Mallinckrodt Chemical Co., and acetonitrile was from Fisher Scientific. All other common chemicals of analytic grade were purchased from Sigma.
The study was designed to compare HCC cases (n = 73) with two groups of controls: controls without manifest liver disease (n = 77) and controls with CLD (n = 52). HCC cases and age- and gender-matched controls free of liver disease were enrolled in collaboration with the Cairo University School of Medicine, Egypt, from 2000 to 2002, as described previously (4). Briefly, adults aged 17 years and older who were seen at the cancer institute with newly diagnosed HCC and had no previous history of cancer were eligible for the study. Diagnosis of HCC was confirmed by pathology, cytology, imaging (computed tomography and ultrasound), and serum AFP levels. Controls were recruited from the orthopedic department of the school of medicine (4). The characteristics of this population are summarized in Table 1, which shows, as expected, increased prevalence of markers of viral infections [HCV RNA, anti-HCV, and anti–hepatitis B virus (HBV)] in cancer cases compared with controls (see also Supplementary Table S2; ref. 4). This comparison group allowed us to study the changes in glycans associated with HCV and HBV infections. The controls with CLD (n = 52), including patients with fibrosis (n = 22) and cirrhosis (n = 25), were recruited from Ain Shams University Specialized Hospital and Tropical Medicine Research Institute (Cairo, Egypt) during the same period. The diagnosis of liver disease in this group of controls was confirmed by ultrasound-guided liver biopsy; the five remaining CLD controls did not have sufficient clinical information. CLD controls were enrolled in the study if they were negative for HBV infection, positive for HCV RNA, and had AFP <100 ng/mL. Controls with AFP >100 ng/mL were excluded to minimize the inclusion of undetected HCC cases in the control group. This comparison group served to evaluate the ability of selected glycans to identify HCC in the background of CLD.
All participants signed informed consent, provided a blood sample, and answered a questionnaire on demographic information, personal habits, medical history, and occupational history. The study protocol was approved by the institutional review committees of all participating institutions and conformed to the ethical guidelines of the 1975 Helsinki Declaration.
Blood samples were collected by a trained phlebotomist each day at around 10 a.m. and processed within a few hours according to a standardized protocol. Aliquots of sera were frozen immediately after collection and kept at −80°C until analysis; all MS measurements were done on twice-thawed sera. Each patient’s HBV and HCV infection status was assessed by enzyme immunoassay for anti-HCV, anti-HBc, and hepatitis B surface antigen and by PCR for HCV RNA (4, 27). We obtained cancer stage information on 51 cases, with 18 cases classified as early (stage I and II) and 33 cases as advanced (stage III and IV) according to the American Joint Committee on Cancer staging system (28); for the remaining cases, the available information was not sufficient to assign the stage.
Human serum samples were reduced and alkylated as described previously (22, 25). Briefly, a 10-μL aliquot of serum was added to 150 μL of 25 mmol/L ammonium bicarbonate and 2.5 μL of 200 mmol/L DTT before incubation at 60°C for 45 min. A 10-μL aliquot of 200 mmol/L iodoacetamide was added and allowed to react at room temperature for 1 h in the dark. Subsequently, a 2.5-μL aliquot of DTT was added to react with the excess iodoacetamide. The reaction mixture was diluted with 100 μL of ammonium bicarbonate to adjust the pH to 7.5 to 8.0 for the enzymatic release of N-glycans using PNGaseF. Next, a 5-milliunit aliquot of PNGaseF was added to the mixture before incubation overnight (18–22 h) at 37°C.
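The reagent bookkeeping in this protocol can be checked with a few lines of arithmetic. The following is only a minimal sketch and not part of the original methods: the helper `nmol` and all variable names are illustrative, and only the volumes and concentrations are taken from the text above.

```python
# Illustrative bookkeeping for the reduction/alkylation step.
# Volumes (uL) and concentrations (mmol/L) come from the protocol text;
# the function and variable names are hypothetical.

def nmol(volume_ul: float, conc_mmol_per_l: float) -> float:
    """Amount in nmol: 1 uL of a 1 mmol/L solution contains 1 nmol."""
    return volume_ul * conc_mmol_per_l

dtt_reduction = nmol(2.5, 200)   # DTT added before the 60 C incubation
iodoacetamide = nmol(10, 200)    # alkylating agent
dtt_quench = nmol(2.5, 200)      # DTT added to consume excess iodoacetamide

# Serum + buffer + DTT + iodoacetamide + DTT + dilution buffer, in uL.
volume_before_pngasef = 10 + 150 + 2.5 + 10 + 2.5 + 100

print(dtt_reduction, iodoacetamide, dtt_quench, volume_before_pngasef)
```

Under these numbers, iodoacetamide is added at a 4-fold molar amount relative to the first DTT aliquot, and the mixture has a nominal volume of 275 μL when PNGaseF is added.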
The volume of enzymatically released glycans was adjusted to 1 mL with deionized water and applied to a C18 Sep-Pak cartridge (Waters), which was preconditioned with ethanol and deionized water as described previously (25). The reaction mixture was circulated through the cartridge five times to retain peptides and O-linked glycopeptides. Glycans were present in the pass-through and the 0.25-mL deionized water washes. The combined eluents were then passed over activated charcoal microcolumns (Harvard Apparatus) preconditioned with 1 mL of acetonitrile and 1-mL aqueous solution of 0.1% trifluoroacetic acid. The microcolumn was washed with 1 mL of 0.1% trifluoroacetic acid and samples were eluted with 1 mL of 50% aqueous acetonitrile with 0.1% trifluoroacetic acid. The purified N-glycans were evaporated to dryness using vacuum CentriVap Concentrator (Labconco Corp.) before solid-phase permethylation.
The permethylation was carried out as described in a recent report (25). Tubes, nuts, and ferrules from Upchurch Scientific were used to assemble the sodium hydroxide capillary reactor. Sodium hydroxide powder was suspended in acetonitrile and packed into Peek tubes (1 mm i.d.; Polymicro Technologies) using a 100-μL syringe from Hamilton and a syringe pump from KD Scientific, Inc. The sodium hydroxide reactor was conditioned with 60 μL of DMSO at 5 μL/min flow rate. Purified N-glycans were resuspended in a 50-μL aliquot of DMSO with 0.3 μL water and 22 μL methyl iodide. This permethylation procedure has been shown to minimize oxidative degradation and peeling reactions and to eliminate excessive cleanup. The sample was infused through the reactor at 2 μL/min and the reactor was washed with 230 μL acetonitrile at 5 μL/min. All eluents were combined, and permethylated N-glycans were extracted using 200 μL chloroform and washed thrice with 200 μL water before drying.
Permethylated glycans were resuspended in 2 μL of methanol/water (50:50) solution. A 0.5-μL aliquot of the sample was spotted on a MALDI plate and mixed with an equal volume of DHB matrix [10 mg DHB in 1 mL of methanol/water (50:50) containing 1 mmol/L sodium acetate to promote formation of sodium adducts in MALDI-MS]. The MALDI plate was dried under vacuum to ensure uniform crystallization. Mass spectra were acquired using an Applied Biosystems 4800 MALDI TOF/TOF Analyzer (Applied Biosystems, Inc.) equipped with a Nd:YAG 355-nm laser, as described previously (22). MALDI spectra were recorded in the positive ion mode because permethylation eliminates the negative charge normally associated with sialylated glycans (29).
Raw spectra were exported as text files for further analysis. Each spectrum consisted of ~121,000 m/z values with the corresponding intensities in the mass range of 1,500 to 5,500 Da (footnote 6). Analyses were carried out in MATLAB (MathWorks) and Statistical Analysis System (SAS, Inc.) software packages; overlay of spectra was created in ClinProTools (Bruker Daltonics). MALDI-TOF mass spectra were processed as described previously (30). Briefly, the dimension of each spectrum was reduced to 13,030 bins (100 ppm step). Baseline-corrected spectra were normalized by dividing each spectrum by its total ion current. We analyzed a total of 203 spectra. One spectrum was eliminated due to truncated acquisition. Peak identification was carried out on a randomly selected training set of 74 samples (25 HCC, 24 controls, and 25 controls with CLD). After scaling the peak intensities to an overall maximum intensity of 100, local maximum peaks above a specified threshold were identified, and nearby peaks within 300 ppm mass were coalesced into a single window to account for drift in m/z location. This procedure identified 85 peak-containing windows; the maximum intensity in each window was used as the variable of interest. The threshold intensity for peak identification was set so that isotopic clusters were represented by a single peak. The isotopic cluster at 1,543 to 1,547 Da was the only cluster resolved by the procedure to three individual peaks; we grouped this cluster to one variable before all analyses. This resulted in a final comparison of 83 variables (glycan intensities). Logistic regression models were used to determine the association of the glycans and covariates, including HCV and HBV infections (independent variables), with HCC status (dependent variable). Glycan intensities were dichotomized for the regression analyses of HCC by the median of the appropriate control group (population or CLD controls separately).
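The binning, normalization, and window-coalescing steps described above can be illustrated with a short sketch. This is not the MATLAB code used in the study: it assumes geometric bins (each edge 100 ppm above the previous one), assumes intensities are summed within a bin, and omits baseline correction and peak picking. All function names are hypothetical.

```python
import bisect
import math

def ppm_bin_edges(mz_min=1500.0, mz_max=5500.0, ppm_step=100.0):
    """Geometric bin edges: each edge lies ppm_step ppm above the previous one."""
    ratio = 1.0 + ppm_step * 1e-6
    n_bins = math.ceil(math.log(mz_max / mz_min) / math.log(ratio))
    return [mz_min * ratio ** i for i in range(n_bins + 1)]

def bin_spectrum(mz, intensity, edges):
    """Sum intensities per bin (assumption: the source does not say sum vs. mean)."""
    binned = [0.0] * (len(edges) - 1)
    for m, y in zip(mz, intensity):
        if edges[0] <= m < edges[-1]:
            binned[bisect.bisect_right(edges, m) - 1] += y
    return binned

def tic_normalize(binned):
    """Divide each bin by the total ion current of the spectrum."""
    total = sum(binned)
    return [y / total for y in binned] if total > 0 else list(binned)

def coalesce_peaks(peak_mz, ppm_tol=300.0):
    """Merge sorted peak positions into windows when a peak lies within
    ppm_tol of the last peak already in the window (chained comparison)."""
    windows = []
    for m in sorted(peak_mz):
        if windows and (m - windows[-1][-1]) / windows[-1][-1] * 1e6 <= ppm_tol:
            windows[-1].append(m)
        else:
            windows.append([m])
    return [(w[0], w[-1]) for w in windows]

# Toy data, not study data.
edges = ppm_bin_edges()
spectrum = tic_normalize(
    bin_spectrum([2472.9, 2473.0, 4052.2], [30.0, 10.0, 60.0], edges)
)
windows = coalesce_peaks([2472.9, 2473.2, 3241.9])
print(len(edges) - 1, windows)
```

With these assumptions the sketch yields on the order of 13,000 bins for the 1,500 to 5,500 Da range; the exact count reported in the text (13,030) depends on details of the original implementation that are not given here.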
For determination of prediction accuracy and construction of receiver operating characteristic (ROC) curves, the 74 spectra randomly selected for window definition were used as a training data set and the remaining 128 spectra served as a blinded validation set (48 HCC, 53 controls, and 27 controls with CLD). Spectra in the blinded validation set were processed using the same criteria described above for the 74 training samples. We performed two sets of analyses: one comparing intensity of all 83 m/z windows and the other using only 48 glycans with assigned structure. A hybrid algorithm that interfaces ant colony optimization with support vector machine (ACO-SVM) was used to select six mass windows (N-glycan peaks) for classification of HCC based on comparison of HCC cases and controls from the training data set as described previously (30). Briefly, the algorithm was run 100 times to select five peaks at a time. Each run consisted of 500 iterations. A 4-fold cross-validation was used to estimate the classification accuracy. A frequency plot was used to select m/z windows from the 100 runs that occurred >50% of the time. A SVM classifier was used to classify the blinded spectra using the selected peaks. Sensitivity and specificity of the marker candidates were evaluated on the blinded validation data set, including the set of controls with CLD.
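Two bookkeeping steps in this paragraph lend themselves to a small illustration: keeping the m/z windows that recur in more than half of the repeated selection runs, and scoring a classifier by sensitivity and specificity. The sketch below does not implement ACO-SVM itself; it assumes the per-run selections and the predicted labels are already available, and the function names and toy data are hypothetical.

```python
from collections import Counter

def frequent_windows(runs, min_fraction=0.5):
    """Return windows selected in more than min_fraction of the runs."""
    counts = Counter(w for run in runs for w in set(run))
    return sorted(w for w, c in counts.items() if c / len(runs) > min_fraction)

def sensitivity_specificity(y_true, y_pred):
    """Labels: 1 = HCC case, 0 = control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: four selection runs over window indices (not study data).
toy_runs = [[1, 2, 3], [1, 2, 4], [1, 5, 6], [2, 1, 7]]
selected = frequent_windows(toy_runs)

# Toy predictions on eight blinded samples (not study data).
sens, spec = sensitivity_specificity(
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 0],
)
print(selected, sens, spec)
```

In the study, the same idea is applied to 100 runs, and the reported sensitivity and specificity refer to the blinded validation set rather than to the training spectra.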
MALDI-TOF MS analysis of permethylated N-glycans, which were enzymatically detached from serum glycoproteins, allowed relative quantification of 83 oligosaccharides. We analyzed a total of 202 serum samples using previously described methods with only minor modifications (30). Comparison of average spectra in HCC cases (n = 73) and controls (n = 77) showed marked differences in glycan abundance (Fig. 1). Analysis of the 83 peak intensities by t test showed significant differences (P < 0.01) in the abundance of 57 glycans; we chose P < 0.01 to adjust for the multiple comparisons. Details of the analysis are presented in Supplementary Table S1 together with description of the known glycan structures. Supplementary Fig. S1 provides a graphical overview of the spectra in the mass range of 1.5 to 5.5 kDa.
Structural composition of 46 of the 83 N-glycans was determined by a combination of enzymatic sequencing and tandem MS as described previously (29, 31). To select candidate markers for detection of HCC, we focused on the glycans with known structure to allow a robust validation of our selection in other laboratories. We selected 6 of the 48 N-glycans as candidate markers for classification of HCC by ACO-SVM computational methods, as described previously (30). These six glycans were selected with >50% frequency in 100 repeats of the ACO-SVM algorithm carried out with 25 HCC and 24 population control spectra. Association of the glycans and covariates (age, gender, HCV and HBV infections, and smoking) with HCC (dependent variable) was analyzed by univariate logistic regression using 73 HCC and 77 control spectra (Supplementary Table S2). Glycan intensities were dichotomized by the median value in population controls; the analysis of glycans as continuous variables did not substantially affect the outcome (data not shown). The logistic regression analysis showed that serologic markers of viral infections and five of the six selected glycans were strongly associated with HCC. The association of the sixth glycan with HCC bordered on significance and became significant after adjustment for HCV infection (see below).
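The median dichotomization used for the regression analyses can be sketched as follows. For a single binary predictor, the odds ratio from univariate logistic regression equals the cross-product ratio of the 2×2 table, so the sketch computes that ratio directly. It is an illustration on toy values, not the SAS analysis used in the study, and the function names are hypothetical.

```python
from statistics import median

def dichotomize(values, cutoff):
    """1 if the intensity is above the cutoff, else 0."""
    return [1 if v > cutoff else 0 for v in values]

def odds_ratio(case_flags, control_flags):
    """Cross-product odds ratio of a 2x2 table.
    Raises ZeroDivisionError if a required cell is empty."""
    a = sum(case_flags)               # cases above the control median
    b = len(case_flags) - a           # cases at or below it
    c = sum(control_flags)            # controls above the median
    d = len(control_flags) - c        # controls at or below it
    return (a * d) / (b * c)

# Toy intensities (not study data).
control_intensity = [1.0, 2.0, 3.0, 4.0]
case_intensity = [2.0, 3.0, 5.0, 6.0]

cutoff = median(control_intensity)    # median of the control group
or_estimate = odds_ratio(
    dichotomize(case_intensity, cutoff),
    dichotomize(control_intensity, cutoff),
)
print(cutoff, or_estimate)
```

Adjusted estimates, such as those controlling for age, gender, or HCV status, require a multivariable model and cannot be read off a single 2×2 table.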
To evaluate the association of these N-glycans with viral infections, we analyzed the association of each of the six N-glycans (independent variable) and viral infections (dependent variable) in the population controls (32% positive for HCV antibodies and 52% positive for HBV antibodies; see Table 1). None of the selected N-glycans was associated with the presence of HBV antibodies; glycans 2 and 4 were associated with HCV antibodies (Supplementary Table S3). Next, we used six multivariate regression models to evaluate the association of each glycan (independent variable) with HCC (dependent variable), following an adjustment for matching variables (age and gender) and for markers of HCV infection (Table 2). All six N-glycans were associated with HCC following the adjustment. We did not include HCV RNA in the regression models because it is correlated with anti-HCV (correlation coefficient = 0.823).
Next, we examined the six glycans in comparison with the CLD control group, which is the most clinically relevant group in need of new markers for the detection of HCC. This group of participants had a biopsy-confirmed fibrosis (n = 22) or cirrhosis (n = 25). Glycans 1, 3, and 6 were less abundant in HCC cases, whereas the three remaining N-glycans increased in intensity from population controls to CLD to HCC. Multivariate logistic regression comparing HCC cases (n = 73) with CLD controls (n = 52) showed that glycans 1, 5, and 6 remained significantly associated with HCC after adjustment for age and gender (Table 3). We did not adjust for viral infection in this analysis because all participants in the CLD group were HBV negative and HCV positive and HBV was not associated with any of the glycans (see Supplementary Table S3). Because all CLD and 80% of HCC cases carry HCV viral infection, the result strongly suggests that the observed change in N-glycans is associated with HCC. Descriptive statistics for the six glycans in population controls, CLD controls, and HCC are presented in Supplementary Table S4.
Figure 2 shows the ROC curves of the three individual glycans that are different compared with CLD controls, as well as their combination, in a blinded, independent validation set of HCC cases (n = 48) and CLD controls (n = 27). The area under the ROC curve for individual glycans ranged from 89% to 93%, whereas the combined classifier has a sensitivity of 90% and specificity of 89% in the blinded, independent validation set. Glycans 1 and 6 are triantennary and tetraantennary complex glycans that decrease in HCC. Glycan 5 is a bisecting glycan that increases in HCC patients. This is consistent with the general trends of changes observed in our study as discussed below.
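The area under the ROC curve quoted for each glycan has a simple rank interpretation: it is the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison on toy scores; it is illustrative only, and the function name is hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs in which the case
    has the higher score; ties contribute 0.5."""
    wins = 0.0
    for x in case_scores:
        for y in control_scores:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Toy scores (not study data).
auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
print(round(auc, 3))
```

For a glycan that decreases in HCC, such as glycans 1 and 6, the intensities would be negated (or the two groups swapped) before computing the AUC so that a higher score still points toward HCC.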
To further evaluate the potential of the three glycans for early detection of HCC, we analyzed the progressive glycan changes from fibrosis to cirrhosis and early cancer (stage I and II) disease (Fig. 3). Notably, 17 of the 18 early cancers and 23 of 25 cirrhotic controls were classified correctly by the three glycans. The prediction accuracy and Fig. 3 suggest that the three selected glycans efficiently separate cirrhosis controls from early-stage HCC; we verified by regression analysis that all three glycans are significantly different at P < 0.05 between cirrhosis controls (n = 25) and early-stage HCC (n = 18). Because AFP was used as a selection criterion for the CLD group (<100 ng/mL included), we could not compare the prediction accuracy directly to AFP. However, 30% of the cases had AFP <200 ng/mL; of these 22 HCC cases, 18 were correctly classified by the three glycans.
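The counts quoted above translate into percentages as follows. This is plain arithmetic on the numbers given in the text; the helper `pct` is illustrative.

```python
def pct(correct: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * correct / total, 1)

early_hcc_detected = pct(17, 18)          # early-stage cancers classified correctly
cirrhosis_correct = pct(23, 25)           # cirrhotic controls classified correctly
overall_accuracy = pct(17 + 23, 18 + 25)  # both groups combined
low_afp_detected = pct(18, 22)            # HCC cases with AFP <200 ng/mL

print(early_hcc_detected, cirrhosis_correct, overall_accuracy, low_afp_detected)
# prints: 94.4 92.0 93.0 81.8
```

These subgroup figures rest on small numbers (18 early-stage cancers and 25 cirrhotic controls), so their uncertainty is correspondingly wide.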
AFP is the only serologic marker used currently in the clinical detection of HCC. Although AFP improves the detection of HCC, a significant number of HCC patients present without elevated AFP so that additional markers are needed to increase the sensitivity and specificity of HCC detection (32, 33). Our study describes MALDI-TOF quantification of glycans enzymatically detached from serum proteins and shows that the three selected N-glycans are excellent candidates for the detection of HCC.
The analysis of glycan changes in progression of liver disease to cancer is not unprecedented (20, 34). However, only recent methodologic advances have made quantitative comparative analysis of glycans in different groups of patients possible (22, 23). Two elegant recent studies of liver disease compared a total of 14 glycans; the authors selected the ratio of two glycans to classify HCC with HBV etiology with 57% sensitivity and 88% specificity compared with CLD controls (24, 35). Our study of HCC with HCV etiology selected three glycans (m/z 2,472.9, 3,241.9, and 4,052.2), which jointly predict HCC with 90% sensitivity and 89% specificity compared with the controls with CLD. A direct comparison of the results is not possible because our method preserves sialylated structures, which are enzymatically removed in the methods used in the other studies. The difference in sensitivity and specificity observed in our study may be related to the methodologic differences, the glycans under study, etiologic HCC differences, or characteristics of the populations. However, all of these studies strongly suggest that examination of glycans in the progression of CLD to cancer has great diagnostic potential. This observation likely reflects a major influence of the liver on the steady state of serum N-glycans and its perturbation in the pathophysiology of this organ.
In this study, we compared HCC cases with two groups of controls: a population control group and a hospital-based control group with a biopsy-verified CLD. Overall, the differences in glycosylation between HCC cases and population controls are substantial with 57 of 83 N-glycans differentially abundant, as differentiated by t test at P < 0.01. We selected three glycans with known structure as candidate markers for the detection of HCC on a CLD background. The selection is based on prediction accuracy and facilitates the choice of the glycans for disease classification (30). The following precautions were used to limit the possibility of false discoveries (36, 37). A substantial proportion of controls carry HCV infection; it is not a “convenience sample” (4). A standardized sample collection and processing protocol was used to minimize variability; the analytic methods were optimized to a mean coefficient of variation of ~15% (25). Glycan quantification/selection followed established procedures with evaluation of prediction accuracy of marker candidates on a blinded validation set of samples (30). To evaluate the prediction accuracy, we included a control group with CLD. We provide preliminary evidence that the three glycans can identify early-stage HCC in the background of liver cirrhosis in this population. In addition, the sugar composition of the three glycans selected as candidate markers for detection of HCC was determined. Further examination is needed to verify the initial observation in a larger group of patients and the validation process will have to include prospectively collected patient samples.
Grouping of the glycans with determined structural composition into high-mannose, hybrid, and complex structure types showed some general trends. As expected, the complex structures represent the majority of the observed glycans (n = 27), with 18 of the complex structures core fucosylated. We detected a relatively high number of bisecting glycans (n = 9), with four of them core fucosylated. We also detected five high-mannose structures and five hybrid glycans of which one was core fucosylated. As described previously, core fucosylation is a frequent event in liver disease (20, 38); eight of the core-fucosylated complex structures increased in HCC compared with the controls, including CLD. In contrast, eight of nine complex structures without core fucosylation decreased in HCC. All four hybrid structures increased in CLD compared with controls, but only the core-fucosylated hybrid glycan was higher in HCC compared with CLD. Seven of nine bisecting glycans (with or without core fucosylation) increased in CLD and four of nine further increased in HCC. Similarly, three of five high-mannose structures increased in CLD, and four of five further increased in HCC. Sialylation results are mixed, with 15 of the 32 sialylated glycans higher in cancer compared with control, 9 higher in control compared with cancer, and 8 unchanged. The biological influences driving these changes need further examination.
In conclusion, we have identified three glycans as candidate markers for the detection of HCC in patients with CLD. Each of the three glycans has an area under the ROC curve between 0.89 and 0.93 for the detection of HCC compared with CLD and the markers efficiently detect early cancer. The individual marker candidates have good prediction accuracy in this population, in general, comparable or superior to AFP (32). The combination of the three glycans performs with sensitivity of 90% and specificity of 89% in the independent validation set of samples. Although the study set is not large and the generalizability of these observations in populations with a more heterogeneous disease etiology needs to be examined, the results presented here seem quite encouraging.
Defining clinically applicable markers of early-stage hepatocellular carcinoma has the potential to improve detection and prognosis of the disease. There is a pressing need to identify biomarkers of hepatocellular carcinoma complementary to the currently available methods. These markers would be used to screen high-risk populations, evaluate disease progression, and test new intervention strategies.
Grant support: This work was supported in part by an Associate Membership from the National Cancer Institute Early Detection Research Network, National Cancer Institute grants R03 CA119288 and R01 CA115625-01A2 (R. Goldman). The collection of specimens and data from the study population was supported by NIH grant R01 CA85888 (C.A. Loffredo). The methodologic and bioanalytic aspects of this study were facilitated by the National Center for Research Resources center grant RR018942 to the National Center for Glycomics and Glycoproteomics at the Department of Chemistry of Indiana University.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Footnote 6: The files are available at http://microarray.georgetown.edu/web/files/glycans.zip.