The aims of this study were to demonstrate the feasibility of centrally collecting and processing high-quality cerebrospinal fluid (CSF) samples for proteomic studies within a multi-center consortium and to identify putative biomarkers for medulloblastoma in CSF. We used two-dimensional gel electrophoresis (2-DE) to investigate the CSF proteome from 33 children with medulloblastoma and compared it against the CSF proteome from 25 age-matched controls. Protein spots were subsequently identified by a combination of in-gel tryptic digestion and MALDI-TOF/TOF MS analysis. On average, 160 protein spots were detected by 2-DE, and 76 protein spots corresponding to 25 unique proteins were identified by MALDI-TOF/TOF MS. Levels of prostaglandin D2 synthase (PGD2S) were found to be 6-fold lower in the tumor samples than in control samples (p<0.00001). This finding was further validated using ELISA. Close examination of PGD2S spots revealed the presence of complex sialylated carbohydrates at residues Asn78 and Asn87. Total PGD2S levels are reduced 6-fold in the CSF of children with medulloblastoma, most likely representing a host response to the presence of the tumor. In addition, our results demonstrate the feasibility of performing proteomic studies on CSF samples collected from patients at multiple institutions within the consortium setting.
One challenge inherent in brain tumor biology studies is the invasive nature of obtaining tissue. This constrains the biologic interrogation of a tumor to one point in time, eliminating the ability to serially monitor the biology of tumor progression or response to therapy. Blood is the next most desirable source of biomarkers, primarily owing to its ready accessibility. However, the huge dynamic range of protein concentration, protein heterogeneity and the multiple environmental factors (such as diet, stress, illness, or physical activity) that can alter the blood proteome make biomarker discovery in blood a daunting task [1, 2]. In contrast to blood and tumor tissue, CSF contains far fewer proteins, is in close contact with brain tissue at many common tumor locations and is accessed by a standard lumbar puncture procedure; in addition, 25% of the proteins present in CSF are brain-specific. Brain pathology may therefore alter the CSF proteome, making CSF a potentially valuable source of biological information about brain tumors. Where CSF biomarkers have been discovered (e.g. malignant germ cell tumors), they play an extremely valuable role in clinical management.
Although there have been a few 2-DE studies of brain tumor associated CSF [6–8], there are no published reports using this technique to study the CSF proteome of medulloblastoma, the most common malignant brain tumor in children. A vital point that is overlooked in many biomarker discovery studies is the inter-individual variability that exists in the CSF proteome independent of any disease condition. We have studied the variation in CSF protein abundances between samples and have taken this into account in our power calculations to determine the sample size required to identify putative biomarkers that are indicative of either the presence of tumor or a host response to the tumor. In the present study, we have applied a classic proteomic approach involving two-dimensional gel electrophoresis (2-DE) and mass spectrometry and demonstrated that the CSF proteome is a valuable source of surrogate biomarkers for medulloblastoma. After thorough statistical analysis, we validated one of these candidate biomarkers, prostaglandin D2 synthase, using ELISA.
CSF samples from 120 children newly diagnosed with a brain tumor were collected as a part of an IRB approved Pediatric Brain Tumor Consortium protocol (PBTC N-08) from 8 institutions in the USA (see acknowledgements). Informed consent was obtained from all participating subjects. When analysis began, CSF from 35 medulloblastoma patients had been collected through N-08, of which 29 had adequate protein content for 2-DE, demonstrating the feasibility of consortium based sample collection and processing. Four intra-institutional samples, processed in an identical manner, were added for a total sample size of 33. Typically, CSF samples were collected either at the time of tumor resection (ventricular samples) or 10–14 days following surgical resection (lumbar puncture) but prior to starting any radiation or chemotherapy treatment. Control CSF samples were obtained at Children’s National Medical Center from leftover samples drawn for other clinical purposes. Each collected CSF sample was immediately centrifuged at 3000 × g for 10 min to remove any cellular debris. Supernatant was then aliquoted into polypropylene tubes and stored at −20 °C until analysis. All samples were processed in the same manner. Samples collected from other institutions were also processed in the same manner and shipped to our laboratory on dry ice.
CSF aliquots containing 75 µg of total protein from each sample were subjected to 2-DE using the Criterion set-up from Bio-Rad (Bio-Rad Laboratories, Hercules, CA). Briefly, samples were re-suspended in 180 µL of rehydration buffer (containing 8 M urea, 2% CHAPS, 50 mM DTT, 0.2% Bio-Lyte ampholytes, 2 M thiourea and bromophenol blue) and isoelectric focusing was performed at 37 kilovolt hours using 11 cm IPG strips, pH 3–10. Second dimension separation was performed using 8–16% polyacrylamide gels. The 2-DE gels were stained with Bio-Safe Coomassie stain (Bio-Rad Laboratories, Hercules, CA) and scanned using a GS-800 densitometer. Differential gel analysis was performed using the REDFIN software (Ludesi, Malmo, Sweden). For protein identification, spots from the 2-DE gel were excised and subjected to in-gel proteolysis using trypsin as previously described. The resulting peptides were extracted, dried in a vacuum centrifuge, re-dissolved in 10 µL of 1% aqueous trifluoroacetic acid (TFA) solution and desalted using C18 ZipTip micropipette tips (Millipore Corporation, Billerica, MA) according to the manufacturer’s user guide. Peptides were eluted from the ZipTip using 10 µL of acetonitrile (ACN)/TFA (70:30 by volume). A volume of 0.3 µL of the peptide solution was mixed with 0.3 µL of matrix solution (50 mM HCCA in a 70:30 ACN/TFA solution) and spotted on the MALDI plate. MS analysis was carried out on a 4700 ABI TOF/TOF mass spectrometer (Applied Biosystems Inc, Foster City, CA). Please refer to the supporting information for a detailed description of the 2-DE and MS analyses.
PGD2S protein spots were excised from the 2-DE gel and in-gel tryptic proteolysis was carried out. To detect glycopeptides, the mass spectrometer was operated in linear positive mode and tuned for a mass range of 2000 to 10,000 Da. Putative glycan moieties were assigned after entering the observed masses (with a mass accuracy of 1 Da) in the GlycoMod tool (http://us.expasy.org/tools/glycomod). Neuraminidase treatment of the in-gel digested tryptic peptides was carried out to confirm the presence of sialic acid groups in these glycopeptides. Briefly, aliquots of tryptic peptides were dried in a vacuum centrifuge and redissolved in 50 mM ammonium bicarbonate (pH 7.4). Aliquots were treated with neuraminidase (Calbiochem, San Diego, CA) to release the sialic acids present in the glycopeptides, and MS spectra in linear mode were obtained as described above.
PGD2S measurements in CSF were performed using commercially available ELISA kits according to the manufacturer’s protocol (Cayman Chemical, Ann Arbor, MI). CSF samples were diluted 1:2000. All samples and standards were run in duplicate, with absorbance measured on the ThermoMax plate reader and data analyzed with SoftMax Pro v2.2.1 (Molecular Devices, Sunnyvale, CA).
REDFIN image analysis of 2-DE gels generates normalized values for pixel density representing relative protein quantity in each spot. The coefficient of variation (CV) was calculated to estimate inter-sample variability. Parametric analyses were performed using square root transformed normalized values. Student’s t test or ANOVA/ANCOVA was carried out for each spot to determine the statistical significance of the differences between the means for each group.
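The per-spot comparison described above can be illustrated with a minimal sketch, assuming a pooled-variance (Student's) two-sample t statistic computed on square root transformed values. The function names and the example intensities are hypothetical and are not taken from the study data; converting the statistic to a p-value would additionally require the t distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, variance


def sqrt_transform(values):
    """Square root transform normalized spot values."""
    return [math.sqrt(v) for v in values]


def student_t(group_a, group_b):
    """Return the pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    dof = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / dof
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / std_err, dof


# Hypothetical normalized values for a single spot (illustration only).
control = [0.040, 0.052, 0.047, 0.061, 0.055]
tumor = [0.008, 0.011, 0.006, 0.012, 0.009]

t_stat, dof = student_t(sqrt_transform(control), sqrt_transform(tumor))
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

In practice a statistics package would normally be used for this step; the hand-written version only makes the arithmetic explicit.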
One criticism of clinical proteomic profiling experiments is the use of too few samples to draw reliable conclusions [13–15]. Very few biomarker studies have employed sample size calculations and false discovery rate (FDR) analysis to report meaningful results [16–19]. In this work, we used a pilot study comprising CSF samples from 10 control and 10 medulloblastoma patients for our sample size and power calculations. We set the desired power to 80% and the desired FDR to 5%. From the pilot experiment it was estimated that approximately 20% of proteins were differentially expressed. Using the formulas provided by Benjamini and Hochberg, it was determined that a p-value threshold of 0.01 was required. A conservative estimate of the variation was used to ensure that the required sample size would not be underestimated. The coefficient of variation was calculated for proteins that were present in 80% of the gels. The required sample size was then calculated using a freeware Java applet, which indicated that a sample size of 22 per group would be likely to give us 80% power to detect significant changes in proteins while keeping the false discovery rate around 5% (see supporting information).
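The relationship between the target FDR, the power and the fraction of changed proteins can be checked with a short sketch. It assumes the common approximation that the expected FDR equals π0·α / (π0·α + π1·power), where π1 is the fraction of proteins that truly differ and π0 = 1 − π1; this may not be the exact formula used in the study, and the function name is hypothetical. The sample size of 22 per group is not reproduced here, because it also depends on variance and effect size estimates given only in the supporting information.

```python
def p_value_threshold(fdr, power, frac_changed):
    """Per-test alpha at which the expected FDR equals `fdr`.

    Assumes expected FDR ~= pi0*alpha / (pi0*alpha + pi1*power),
    solved for alpha.
    """
    pi1 = frac_changed          # fraction of proteins truly changed
    pi0 = 1.0 - pi1             # fraction of unchanged proteins
    return fdr * pi1 * power / (pi0 * (1.0 - fdr))


# Values stated in the text: FDR 5%, power 80%, about 20% of proteins changed.
alpha = p_value_threshold(fdr=0.05, power=0.80, frac_changed=0.20)
print(round(alpha, 4))  # 0.0105
```

Under these assumptions the threshold comes out close to 0.01, in line with the value reported above.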
75 µg of total protein from each CSF sample was used for 2-DE analyses. Representative gels from control and disease samples are shown in Figure 1. Though 2-DE is the most mature proteomic technology, one of its shortcomings is its high variability, especially when handling clinical samples . Additionally, as a result of the known significant variation between biologic samples from different individuals, we studied the extent of variation in CSF protein levels from 25 control individuals. For inter-sample normalization, the intensity value for a protein spot was divided by the total intensities of all spots present in the same gel resulting in an “abundance ratio” which was calculated for each spot. The extent of variability in spot abundances between samples was determined from the coefficient of variation. Only spots that were present in at least 80% of the gels (20 out of 25) were included in the CV analysis. Interestingly, there was not a significant correlation between abundance ratio and CV (p = 0.4330, r = −0.0505) (see supporting information).
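The normalization and variability estimate described above can be sketched as follows. This is a minimal illustration, assuming each gel is represented as a mapping from spot identifier to raw intensity and that a spot missing from the mapping was not detected on that gel; the function names and toy intensities are hypothetical.

```python
from statistics import mean, stdev


def abundance_ratios(gel):
    """Divide each spot intensity by the total intensity of all spots on the gel."""
    total = sum(gel.values())
    return {spot: value / total for spot, value in gel.items()}


def spot_cvs(gels, min_presence=0.80):
    """Coefficient of variation per spot, for spots present in enough gels."""
    ratios = [abundance_ratios(gel) for gel in gels]
    all_spots = set().union(*(r.keys() for r in ratios))
    cvs = {}
    for spot in all_spots:
        values = [r[spot] for r in ratios if spot in r]
        if len(values) / len(gels) >= min_presence and len(values) > 1:
            cvs[spot] = stdev(values) / mean(values)
    return cvs


# Toy data: spot "A" is on all 5 gels, "B" on 4 of 5 (80%), "C" on 2 of 5.
gels = [
    {"A": 120.0, "B": 40.0, "C": 10.0},
    {"A": 100.0, "B": 55.0},
    {"A": 140.0, "B": 35.0, "C": 12.0},
    {"A": 90.0, "B": 60.0},
    {"A": 110.0},
]
cvs = spot_cvs(gels)
print(sorted(cvs))  # ['A', 'B']
```

With 25 control gels, the same 80% rule corresponds to the 20 out of 25 cutoff stated in the text.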
Based upon the pilot project sample size estimates, we used 25 control and 33 medulloblastoma samples for 2-DE proteomic analysis. An average of 160 protein spots were detected in our 2-DE gels, of which 76 spots corresponding to 25 unique proteins were identified by MALDI MS analysis (see supporting information). We found that a total of 9 protein spots were altered between control and medulloblastoma (p<0.01) with a predicted false discovery rate of 5%. These 9 differentially expressed protein spots were subsequently identified as isoforms of 3 distinct proteins (Table 1).
Despite advances in 2-DE technology, obtaining consistent and reproducible 2-DE gels remains challenging. Given the observed inter-sample variation, it was necessary to discern the amount of variation attributable to technical factors. Control gels were run over a period of 15 months. In order to determine technical variability as a function of time, gels were divided into 3 groups based on the time of run and the CV was calculated for each of these groups. The average CV for the whole group of control sample gels was found to be 0.5 with the individual CV values for the three groups being 56%, 47% and 49% representing an acceptable level of variation (data not shown).
Three proteins, namely apolipoprotein E (apo E), apolipoprotein J (apo J) and prostaglandin D2 synthase (PGD2S), were differentially expressed in the CSF of medulloblastoma patients (see Figure 1 and Table 1). Acidic spots 14 and 22 both correspond to apo E and were down-regulated by about 2-fold in the medulloblastoma samples versus control samples. In contrast, spot 669 was identified as apo J and was increased 3-fold in the tumor group. Six isoforms of PGD2S, including three acidic, one neutral and two basic isoforms, were also down-regulated in the medulloblastoma CSF samples. Additionally, a 6.3-fold decrease in total PGD2S levels (p<0.00001), inclusive of all isoforms, was observed.
A square root transformation of the values yielded a Gaussian distribution of the data set. Student’s t test was carried out to determine the statistical significance of the difference between the means for each group. One-way ANOVA was used to test the possible effect of patient age, histological subtype (desmoplastic, anaplastic, classical), clinical group (case versus control), presence of metastases (M0 versus M+) and extent of surgical resection on the decrease in the levels of PGD2S isoforms observed in the CSF of these medulloblastoma patients (see supporting information). Among these, clinical group was the only discriminating variable identified for PGD2S expression.
To study the effect of the presence of gross tumor on PGD2S levels, the medulloblastoma samples were divided into two groups: one in which the tumor was totally resected and CSF samples were collected by lumbar puncture (LP) 10–14 days after surgery, and a second consisting of samples collected at a time when the tumor was still present (not totally resected, or with metastatic disease, or ventricular CSF obtained at the time of surgery). There was no difference in the PGD2S spot intensities between the two groups (see supporting information). However, the intensity of spot 166 in the “no gross tumor” sample group was no longer significantly different from the controls.
To validate decreased PGD2S levels as a biomarker for medulloblastoma in CSF, we used a commercially available sandwich ELISA kit (Cayman Chemical, Ann Arbor, MI) to measure PGD2S levels in the same cohort. The ELISA, performed on 17 medulloblastoma and 10 control samples, corroborated the 2-DE finding, showing a 7.9-fold reduction in total PGD2S in tumor samples after normalizing the values to total protein concentration (p<0.000002) (Figure 2). A smaller number of samples was used because of limited sample availability.
We detected 7 isoforms of PGD2S in our 2-DE gels with a spot pattern that is characteristic of glycosylation (Figure 1). Close examination of spots 12, 2 and 4, the most abundant isoforms, revealed a number of peaks in the higher mass range (2000–8000 Da) most likely corresponding to glycopeptides (Figure 3). Of particular interest were the ions at m/z values of 4163, 3960, 3870 and 3667, which form two pairs (4163/3870 and 3960/3667) separated by approximately 292 Da, equivalent to one sialic acid molecule. After neuraminidase treatment, the peaks at m/z 4163 and 3960 disappeared, confirming the presence of sialic acids in these two glycopeptides. We performed GlycoMod analysis using the observed molecular masses obtained in the linear mode to predict putative structures for the glycans, and our data suggest the presence of complex glycans at Asn78 and Asn87 (Table 2). Examination of the amino acid sequence of PGD2S revealed three possible N-linked glycosylation sites (identified by the motif N-X-S/T/C) corresponding to amino acid positions 51, 78 and 87. Notably, the putative glycosylation we detected at Asn87 was not reported in an earlier work by Hoffmann et al exploring the glycosylation sites of PGD2S. GlycoMod analysis predicted hybrid type carbohydrate structures composed of 2 N-acetylglucosamine (GlcNAc), 3 mannose (Man), 3 GlcNAc, 2 galactose (Gal) and 1 N-acetylneuraminic acid (NeuNAc, also known as sialic acid) for the peptide (67–86) bearing the glycosylation site at Asn78, while the carbohydrate chain linked to Asn87 in the peptide (87–92) consisted of 2 GlcNAc, 3 Man, 4 GlcNAc, 3 Gal, 1 fucose (Fuc) and 3 NeuNAc (see supporting information). All three spots had glycopeptide signatures at m/z 4163, 3960, 3870 and 3667 but with different relative intensities.
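The pairing of peaks by a sialic acid mass difference can be illustrated with a short sketch. It assumes an average residue mass of 291.26 Da for N-acetylneuraminic acid and a tolerance of 2 Da on the difference (1 Da per peak, following the stated mass accuracy); both the tolerance rule and the function name are assumptions made for illustration, not part of the published analysis.

```python
from itertools import combinations

# Average mass of one N-acetylneuraminic acid (sialic acid) residue, in Da.
SIALIC_ACID_RESIDUE = 291.26


def sialic_acid_pairs(peaks, tolerance=2.0):
    """Return (higher, lower) peak pairs whose spacing matches one sialic acid residue."""
    ordered = sorted(peaks, reverse=True)
    return [
        (hi, lo)
        for hi, lo in combinations(ordered, 2)
        if abs((hi - lo) - SIALIC_ACID_RESIDUE) <= tolerance
    ]


# Glycopeptide ions reported in the linear-mode spectra.
peaks = [4163, 3960, 3870, 3667]
print(sialic_acid_pairs(peaks))  # [(4163, 3870), (3960, 3667)]
```

The higher-mass member of each pair (m/z 4163 and 3960) is the one that disappeared after neuraminidase treatment, which is the behavior expected if it carries one additional sialic acid.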
In this study, we have identified PGD2S as a candidate biomarker for pediatric medulloblastoma after studying the extent of variability that exists in the CSF proteome. Protein abundance ratios and associated CV values were calculated to estimate the degree of CSF protein heterogeneity. The abundance ratios of proteins varied widely in both control and tumor groups, in agreement with the high dynamic range reported in other studies [22, 23]. We identified altered levels of apo E, apo J and PGD2S in the CSF of medulloblastoma patients. Alterations in the levels of apo E have been reported in glioblastoma and various other cancers including pancreatic and lung [24–26]. Up-regulation of apo J, also known as clusterin, has been suggested to play a role in cancer progression by protecting cancer cells against apoptosis [27, 28]. Similar to our finding, Pucci et al reported overexpression of apo J in the cytoplasm of primary colon cancer cells, from which it is released into the extracellular space.
The most intriguing finding in our study is the alteration of prostaglandin D2 synthase (PGD2S) isoforms. PGD2S, also known as β-trace, is one of the most abundant glycoproteins in human CSF; it is synthesized and secreted by both glial cells and the choroid plexus. An ideal biomarker is one that is expressed in higher amounts (if not exclusively) in the CSF of children with tumors compared to controls, in which case it is logical to speculate that the tumor cells are the source of that protein. Since PGD2S is instead reduced in tumor CSF samples, we speculate that the reduction is a host response to the presence of the tumor. Consistent with this hypothesis is the fact that PGD2S abundance did not vary depending upon disease burden (e.g. between tumors with metastases and completely resected tumors), even when measured in samples obtained 10–14 days after tumor resection. One potential advantage of a biomarker derived from a host response is that it can be amplified, analogous to the way that serum antibody levels created by an immune response can indicate previous exposure to an infectious agent. Although PGD2S has been investigated in various diseases, only two studies quantitated PGD2S levels in CSF [32–36]. Saso et al reported reduced levels of PGD2S in the CSF of brain tumor patients; however, the study did not specify the types of tumors that were investigated. Our study differs in its specific focus on medulloblastoma and the pediatric population. The other report, by Huang et al, showed a reduced level of PGD2S in the CSF of patients with acute inflammatory demyelinating polyneuropathy. There is still very limited knowledge of the mechanisms involved in the alteration of PGD2S levels in CSF. The established function of PGD2S is the synthesis of prostaglandin PGD2 from PGH2.
Therefore, we quantitated levels of PGD2 in our CSF samples using ELISA but found no correlation with PGD2S quantities in CSF samples from tumor and control subjects (data not shown), most likely because of the instability of PGD2 in biological fluids and its rapid degradation to the 15-deoxy prostaglandin J2 metabolite.
We performed statistical analysis to investigate confounding variables that could affect the proteome. Age was not found to be a significant covariate in the comparative analysis of PGD2S spot intensities between control and tumor samples. Likewise, there was no significant variation based upon the presence of metastases or histological subtypes (see supporting information). None of the PGD2S isoform levels were different between CSF samples associated with gross residual disease and those associated with completely resected disease. This observation could suggest that either these spots are sensitive enough to indicate the presence of minimal residual disease after surgical removal of the tumor or that the turnover of PGD2S in the CSF requires more than 2 weeks. The remaining, though unlikely, possibility is that PGD2S levels never return to normal levels. Further work with serial samples will be required to determine the true explanation.
Because neither age, histological subtype, effect of surgery nor presence of metastases correlated with the intensity of the PGD2S spots, the observed alteration in PGD2S was not attributed to any factor other than disease status. Furthermore, the consideration of blood contamination was rendered moot by the fact that PGD2S is a protein synthesized and secreted by glial cells and the choroid plexus, as noted above. Proteins that exhibit great inter-individual variability make poor candidate biomarkers for disease. As depicted in the supporting information, although the CV values of individual spots attributable to inter-individual variation showed no dependency on spot abundance ratios, the average CV was found to be 0.7. In other words, a typical CSF protein varies between different individuals by about 70%. Knowledge about inter-individual proteome variations in human samples (both tissue and biological fluid) is currently limited, with only two published reports on this topic. The first study reported an average CV of 0.18 within the platelet proteome. In the second, Hu et al conducted both intra- and inter-individual variability studies on the CSF proteome; although they reported that inter-individual variability is much greater than intra-individual variation, they did not report the average CV. Given the high variability that exists in the CSF proteome, an understanding of the contribution of technical variation is also critical for differential proteomic investigations. In our hands, the contribution of technical variability to the observed total variability was 9%, well within the anticipated values for 2-DE.
In addition to the decrease in total PGD2S levels, there was a marked difference in the levels of various isoforms between patients and controls. In particular, there was at least a 2-fold reduction of the acidic isoforms of PGD2S in medulloblastoma samples (see Table 1). This prompted us to identify the glycosylation patterns in the various isoforms of PGD2S. Since the linear MS spectra for all three spots exhibited pairs of peaks with masses of 4163 and 3870 as well as 3960 and 3667, which differ in mass by 292 Da (the mass of a sialic acid molecule), we hypothesized that all three spots contained sialylated glycopeptides but with different numbers of sialic acid molecules. This was confirmed by neuraminidase treatment (see Figure 3). We were able to detect only two of the three predicted glycopeptides from PGD2S, one spanning amino acid sequence 67 to 86 and the other starting at 87 and ending at 92. We speculate that these PGD2S isoforms are a result of differential glycosylation patterns of the three predicted sites in the protein. Further analysis of purified PGD2S using accurate mass measurements is required to fully characterize its glycosylation pattern in CSF, but such an analysis is beyond the scope of this clinical study. In conclusion, the observations described herein indicate that PGD2S is a candidate CSF biomarker for medulloblastoma, which could be useful in monitoring the efficacy of treatment and detecting the recurrence of disease.
The authors acknowledge the support of James M. Boyett PhD, Executive Director of the Operations and Biostatistics Center for the PBTC, Larry E. Kun MD, Chair of the PBTC, Dana Wallace MS, Stacye Richardson MS, Stewart Goldman MD, Ian Pollack MD, and Andreas Ekefjard of Ludesi. This work was supported in part by NIH grant U01 CA81457 for the PBTC, the American Lebanese Syrian Associated Charities, The Childhood Brain Tumor Foundation, The Becca’s Run Fund, The Schlobohm Family, and Friends of Ian. This work was also partially supported by NIH core grants 1P30HD40677 and 5R24 HD050846.
Conflict of interest statement
The authors declare that they have no financial or commercial conflicts of interest.