Amyotrophic lateral sclerosis (ALS) is characterized by degeneration of motor neurons. We tested the hypothesis that proteomic analysis will identify protein biomarkers that provide insight into disease pathogenesis and are diagnostically useful. To identify ALS specific biomarkers, we compared the proteomic profile of cerebrospinal fluid (CSF) from ALS and control subjects using surface-enhanced laser desorption/ionization-time of flight mass spectrometry (SELDI-TOF-MS). We identified 30 mass ion peaks with statistically significant (p < 0.01) differences between control and ALS subjects. Initial analysis with a rule-learning algorithm yielded biomarker panels with diagnostic predictive value as subsequently assessed using an independent set of coded test subjects. Three biomarkers were identified that are either decreased (transthyretin, cystatin C) or increased (carboxy-terminal fragment of neuroendocrine protein 7B2) in ALS CSF. We validated the SELDI-TOF-MS results for transthyretin and cystatin C by immunoblot and immunohistochemistry using commercially available antibodies. These findings identify a panel of CSF protein biomarkers for ALS.
Amyotrophic lateral sclerosis (ALS) is the most common adult motor neuron disease, affecting one in every 40 000 individuals (Jackson and Bryan 1998). It typically affects individuals in their mid-50s and is characterized by rapidly progressive degeneration of motor neurons in the cerebral cortex, brainstem and spinal cord. The median survival in ALS is three to five years (Jackson and Bryan 1998; Cleveland and Rothstein 2001).
ALS exists in both sporadic and familial forms. Familial ALS (FALS) comprises only 5–10% of all ALS cases. To date five genes have been implicated as causes of FALS (Bruijn et al. 2004). The most common is the gene for Cu/Zn cytosolic superoxide dismutase (SOD1). Transgenic mice expressing SOD1 with point mutations that cause human ALS develop an age-dependent, progressive motor weakness that mimics human ALS (Rosen et al. 1993; Gurney et al. 1994; Wong et al. 1995). By contrast with these rare types of FALS, the pathogenesis of sporadic ALS (SALS) is poorly understood (Bruijn et al. 2004). Many lines of investigation have explored the pathogenesis of ALS (Rosen et al. 1993; Van Westerlaak et al. 2001; Subramaniam et al. 2002; Ranganathan and Bowser 2003). However, no definitive causes of SALS have emerged. There is only one FDA approved drug, riluzole, which has a modest effect on survival (Riviere et al. 1998; Kriz et al. 2003).
Many neurodegenerative diseases, including ALS, exhibit cytoplasmic and/or nuclear protein aggregates that likely participate in disease pathogenesis. In ALS, changes in protein composition of the cerebrospinal fluid (CSF) or serum may denote corresponding alterations in protein expression, post-translational modifications or turn-over within the tissue of the central nervous system (CNS) (Rohlff 2000). Because CSF contains proteins and protein fragments released from ALS-affected neurons and glia, it seems likely that profiles of CSF proteins may serve as biomarkers for the process of motor neuron degeneration in the spinal cord in this disease. Indeed, the identification of ALS specific protein biomarkers may provide insight into the nature of the degenerative process. Additionally, such biomarkers may be useful both in the diagnosis of this disease and in monitoring the response of the degenerative process to therapeutic interventions.
Proteomic analyses have been used to uncover biomarkers in other CNS disorders and neurodegenerative diseases such as multiple sclerosis, schizophrenia, Alzheimer's disease (AD), and HIV-1 associated cognitive impairment (Choe et al. 2002; Tsuji et al. 2002; Allen et al. 2003; Luo et al. 2003; Dumont et al. 2004). 2D gel electrophoresis has revealed protein alterations in ALS patients and the SOD1 mouse model of ALS (Wiederkehr et al. 1989; Perry et al. 1990; Smith et al. 1998; Jacobsson et al. 2001; Strey et al. 2004). Mass spectrometric (MS) analysis of the effects of G93A and G37R mutant SOD1 expressed in a motor neuron-like cell line disclosed alterations in proteins associated with antioxidant defenses, proteasome machinery and nitric oxide metabolism (Allen et al. 2003). Small molecule and lipid markers of ALS have also been analyzed; thus, recent studies have identified increased serum and CSF levels of lipid peroxidation in ALS patients, and increased levels of ceramides, sphingomyelins and cholesterol esters in spinal cords of ALS patients and transgenic mice using MS/MS techniques (Cutler et al. 2002; Simpson et al. 2004). However, while biochemical assays using CSF from ALS patients have identified molecular alterations in ALS, these studies have not yet produced a sensitive and specific biomarker(s) for this disease. A recent study using liquid chromatography-Fourier transform ion cyclotron resonance mass spectrometry revealed CSF protein patterns in ALS patients, though no protein identities were reported and the predictive value for this pattern remains unclear (Ramstrom et al. 2005).
We report a proteomic profile of CSF from recently diagnosed ALS patients and control subjects using surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS). CSF was obtained from ALS patients on average 385 days from symptom onset. Using two Ciphergen Biosystems ProteinChip arrays and computer-assisted algorithms, we identified disease specific protein peaks. We report statistically significant changes in 30 mass spectrometric signals (p < 0.01), and identified biomarker panels that accurately differentiate ALS from control subjects. Finally, we determined the protein identity of three ALS biomarkers with high diagnostic predictive value and further validated these findings using antibody-based techniques on ALS and control subjects. To our knowledge, this is the first study to identify a panel of predictive protein biomarkers in CSF of ALS patients.
The total number of subjects used for mass spectrometry was 23 ALS and 31 controls. The initial training group included 21 control subjects and 15 patients with a recent clinical diagnosis of ALS by board certified neurologists specializing in motor neuron diseases. The average time from clinical onset (date when patient retrospectively reported initial symptoms) to when CSF was obtained was 385 days (113–730 days) for ALS subjects. This represents the typical time required for ALS patients to obtain a clinical diagnosis from the time of symptom onset. Fourteen of the ALS subjects were sporadic cases and one familial non-SOD1 subject was included. The average ages of the ALS and control cohorts were 49.6 ± 3.4 and 45.2 ± 3.4 years, respectively. The control group included the following: 12 healthy subjects with no neurologic complaints; one with metabolic myopathy; two each with neuralgia and neuropathy; and one case each of meningitis, demyelinating disorder, slowly progressing atypical motor neuron disease and probable AD. We also used a separate test group of 18 coded subjects to predict disease status based on rules generated using the training group. This test group included 10 controls (average age of 45.8 ± 5.9 years) and eight ALS subjects (average age of 52.8 ± 4.6 years and average time from symptom onset to CSF draw of 504 days). The control group included four healthy subjects, two cases of demyelinating disorder, one neuropathy, one Lyme disease, one meningitis, and one stroke. A second test group of 12 subjects included seven controls (average age of 49.2 ± 4.4 years) and five ALS subjects (average age of 52.5 ± 6.4 years with an average time from symptom onset to CSF draw of 392 days). The control group included three healthy subjects, two cases of demyelinating disorder, and two neuropathy subjects.
Cerebrospinal fluid (CSF) was obtained by lumbar puncture, immediately centrifuged at 450 g for 5 min at 4°C to remove cellular debris, aliquoted, frozen at −80°C and thawed on ice prior to use. 2D-Quant kit (Amersham, Piscataway, NJ, USA) was used to determine protein concentrations (0.06 μg/μL to 0.6 μg/μL for each CSF sample). University of Pittsburgh Institutional review board (IRB) and Massachusetts General Hospital IRB approved informed consent for this procedure was obtained from all subjects.
Experiments were first performed to optimize the Ciphergen ProteinChips (Ciphergen Biosystems, Inc., Palo Alto, CA, USA) and binding conditions. Two protein chips, SAX2 and IMAC, exhibited the best spectral data and were utilized for further analyses. The SAX2 chips were equilibrated with 100 mM Tris-HCl pH 8.5 and then fractionated samples were added to the spots (one sample per spot). The IMAC arrays were treated with 100 mM zinc sulfate followed by washing with 50 mM sodium acetate. These were then repeatedly washed with HPLC-grade water (Sigma, St Louis, MO, USA) and phosphate-buffered saline. CSF samples (2.2 μg) were diluted in 1% trifluoroacetic acid to a final trifluoroacetic acid concentration of 0.1%. Samples were fractionated using C4 ZipTip pipette tips (Millipore, Bedford, MA, USA) according to the manufacturer's instructions and eluted onto the spots using 90% acetonitrile. A saturated matrix solution of 4-hydroxy-α-cinnamic acid was then added to the spots. The spots were dried at 23°C before acquiring spectra using the Ciphergen ProteinChip Reader (PBSIIc; Ciphergen Biosystems). The spectra of proteins/analytes were generated using a laser intensity range of 179–185 and a detector sensitivity range of 7–9 with a mass deflector setting of 1000 Da for the low mass range (1–20 kDa). These settings were kept constant for all chips in each experiment. For individual experiments all CSF samples were analyzed in duplicate (two separate spots on two different ProteinChips) to demonstrate reproducibility of the results, and each experiment was repeated three times.
In each experiment, one standard CSF sample was used as an internal control to measure variability of the mass spectra between chips. The coefficient of variation (CV) for four selected mass-to-charge (m/z) signals was less than 25%. External calibration of the ProteinChip Reader was performed using the Ciphergen Biosystems 5-in-1 peptide mix [human angiotensin 1 (1296.5 Da), porcine dynorphin A (2147.5 Da), human ACTH (2933.5 Da), bovine insulin (5733.6 Da) and bovine ubiquitin (8564.8 Da)] in which horse cytochrome c (12360.2 Da) and horse myoglobin (16951.5 Da) were included.
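As a minimal sketch of the between-chip variability check described above, the percent CV of one m/z signal can be computed as the sample standard deviation divided by the mean. The intensity values below are hypothetical placeholders, not study data, and the function name is our own.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """Return the percent CV: sample standard deviation / mean * 100."""
    m = mean(intensities)
    if m == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return stdev(intensities) / m * 100.0


# Hypothetical intensities of one m/z signal from the standard CSF sample,
# measured on four different chips (illustrative values only).
standard_peak = [10.2, 11.5, 9.8, 12.1]
cv = coefficient_of_variation(standard_peak)
print(f"CV = {cv:.1f}%")
```

Under this sketch, a signal would pass the criterion stated above when its CV falls below 25%.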
Total ion current of all profiles was used to normalize each of the spectrograms. The Ciphergen 3.1 Biomarker Wizard application autodetected mass peaks by clustering and analyzed the output using non-parametric Mann–Whitney statistical analysis, which constituted the univariate analysis of the data. Peak labeling was performed using second-pass peak selection with a signal to noise ratio of 1.5.
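The two steps named above, total ion current normalization and the Mann-Whitney comparison, can be sketched as follows. This is an illustration under our own assumptions rather than the Biomarker Wizard implementation: the function names and intensity values are hypothetical, and only the U statistic is computed (a library routine such as `scipy.stats.mannwhitneyu` would be needed for p-values).

```python
def normalize_to_tic(spectrum):
    """Scale intensities so they sum to 1 (total ion current normalization)."""
    tic = sum(spectrum)
    if tic <= 0:
        raise ValueError("total ion current must be positive")
    return [x / tic for x in spectrum]


def mann_whitney_u(group_a, group_b):
    """Return U for group_a: the number of (a, b) pairs with a > b, ties counting 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical normalized intensities of a single peak (illustrative only).
control_peak = [0.9, 1.1, 1.0, 1.2]
als_peak = [0.5, 0.6, 0.7, 0.4]

u_control = mann_whitney_u(control_peak, als_peak)
u_als = mann_whitney_u(als_peak, control_peak)
print(u_control, u_als)
```

The two U values always sum to the product of the group sizes; a value near either extreme indicates that the peak intensities of the two groups barely overlap.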
To uncover putative diagnostic biomarkers, mass ion peaks were analyzed using the Rule Learner (RL) algorithm. RL was first used for predicting mass spectra of complex organic molecules (Feigenbaum and Buchanan 1993). RL creates and searches possible rules by successive specialization, guided by spectral data in a training set and by prior knowledge about the data (e.g. clinical diagnosis or symptoms) to define diagnostic biomarkers (Provost and Buchanan 1995). For this study we used both raw and normalized spectral data. The data were normalized by scaling each mass ion intensity to the interval [0, 1]. RL learns predictive patterns by starting with a single mass peak and adding one additional peak at a time to create partial rules (essentially IF-THEN statements) that look most promising at classifying a subject as control or ALS. Each partial rule is matched against the training cases to see how many of the positive cases are correctly predicted and how many of the negative cases are incorrectly predicted, thus providing statistical guidance to the search. Due to the large amount of data (greater than 35 000 m/z data points), RL was run separately on 10 almost equal subsets of the data to pick representative biomarkers from each subset. In order to allow these biomarkers to compete among one another, we performed a final run of RL with the union of such features selected from each of the previous runs. A three-fold or five-fold cross-validation is utilized to tune the RL input parameters during the training, which produces the final set of rules (i.e. predictors). Evidence gathering is performed using ‘weighted voting’ wherein each coded sample is assigned a group (ALS or control) by the casting of a ‘weighted vote’ (the number of training samples that defines a particular rule). Therefore a particular subject may be classified as ALS even if it has one rule that predicts control. 
The predictions are then compared to known clinical diagnosis to determine sensitivity and specificity.
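The scaling and weighted-voting steps described above can be sketched in a few lines. This is a simplified illustration and not the RL program itself: the rule thresholds and weights are hypothetical, and returning no prediction for a tied vote is our own assumption.

```python
from collections import defaultdict


def min_max_scale(values):
    """Scale a list of intensities to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical IF-THEN rules: (peak, comparison, threshold, predicted class, weight).
# The weight stands for the number of training samples that support the rule.
RULES = [
    ("IMAC 3.42 kDa", ">", 0.60, "ALS", 12),
    ("IMAC 13.38 kDa", "<", 0.40, "ALS", 10),
    ("SAX 6.88 kDa", ">", 0.70, "control", 15),
]


def classify(sample, rules=RULES):
    """Sum the weights of all rules that fire and return the class with the most votes."""
    votes = defaultdict(int)
    for peak, op, threshold, label, weight in rules:
        value = sample.get(peak)
        if value is None:
            continue  # peak not measured in this sample
        fired = value > threshold if op == ">" else value < threshold
        if fired:
            votes[label] += weight
    if not votes:
        return "no prediction"
    ranked = sorted(votes.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "no prediction"  # assumption: a tied vote yields no call
    return ranked[0][0]


# One rule votes control (weight 15) but two rules vote ALS (12 + 10 = 22).
sample = {"IMAC 3.42 kDa": 0.80, "IMAC 13.38 kDa": 0.20, "SAX 6.88 kDa": 0.75}
print(classify(sample))
```

The example sample illustrates the point made above: a subject can be classified as ALS even though one of the rules that fires predicts control.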
Samples (500 μL CSF) were fractionated by gravity flow using Q HyperD F matrix (Ciphergen Biosystems) in Biospin columns (BioRad Laboratories, Hercules, CA, USA) for anion exchange fractionation. The columns were first equilibrated using 50 mM Tris-HCl at pH 9. Fractions were collected using buffers at varying pH of 9 (20 mM Tris-HCl/0.1% Triton X-100), pH 7 (50 mM HEPES/0.1% Triton X-100), pH 5 and 4 (100 mM sodium acetate/0.1% Triton X-100) and pH 3 (50 mM sodium citrate/0.1% Triton X-100). Finally the column was washed with a 33% isopropanol/17% acetonitrile (ACN)/0.1% trifluoroacetic acid solution (organic fraction). A small amount of each of these fractions was spotted on SAX2 and NP20 (normal phase) chips to confirm the presence and purity of relevant spectral peaks. Aliquots of 50–500 μL were concentrated using YM3 or YM10 filtration devices (Millipore). The resulting fractions were electrophoresed by sodium dodecyl sulfate–polyacrylamide gel electrophoresis. Excised bands were incubated with 200 μL of 40% methanol/10% acetic acid solution to remove sodium dodecyl sulfate. After incubation with 200 μL of ACN, the gel pieces were dried and proteins eluted in 20 μL of 50% formic acid/20% ACN/15% isopropanol solution for 2 h with strong agitation. A small amount of the eluted proteins was spotted onto an NP20 chip and the rest was dried in a speed-vacuum. The protein pellet was then re-hydrated in 10 μL of 25 mM ammonium bicarbonate (pH 8) containing 0.02 μg/μL of sequencing grade modified trypsin (Promega, Madison, WI, USA) and incubated at 37°C for 16 h. The tryptic digests (1–2 μL eluates) were tested again on an NP20 chip surface and then applied for peptide sequencing using a QSTAR tandem MS with a ProteinChip interface (Applied Biosystems Inc., Foster City, CA, USA). Peptide and amino acid sequences were compared to ProFound protein databases to confirm identity.
A total of 58 subjects, distinct from those used for mass spectrometry, were used for biomarker validation. Biomarkers were validated using western immunoblotting and immunohistochemistry using separate subject sets. For immunoblot analysis, a total of 34 subjects, including 20 coded test samples, were used for analysis. Fourteen CSF samples were from subjects used in the mass spectrometry analysis and used to confirm the mass spectral data. Anti-rabbit polyclonal antibodies against cystatin C (DAKO, Carpinteria, CA, USA) and transthyretin (TTR) (DAKO) were used for these studies. Immunoblotting for cystatin C and TTR was performed using a separate cohort of recently diagnosed ALS and control subjects. Twenty-five or 50 μg of CSF protein from controls (n = 17, average age of 58.5 years) and ALS (n = 17, average age of 54 years) subjects were electrophoresed on 10–20% Tris-Tricine Ready Gels (Bio-Rad Laboratories) and transferred to a polyvinylidene difluoride membrane. The control group included six healthy subjects, two multiple sclerosis, two Lyme disease, two normal pressure hydrocephalus, one dementia, one epilepsy, one myopathy, one meningitis, and one neurofibromatosis subject. The average time between ALS symptom onset and CSF draw was 466 days. The primary antibodies were used at a dilution of 1:500. For protein confirmation, we loaded 10 ng of purified human cystatin C protein (Calbiochem, San Diego, CA, USA) or 50 ng of TTR (Biodesign, Saco, ME, USA) into separate gel lanes.
Immunohistochemistry for TTR and cystatin C by light microscopy was performed with paraffin-embedded lumbar spinal cord sections from archived postmortem tissues from the University of Pittsburgh ALS Tissue Bank as previously described (Ranganathan and Bowser 2003). Neuropathologic assessment confirmed the clinical diagnosis of ALS or the lack of any central nervous system abnormalities within the control subjects. Healthy controls (n = 8) and ALS (n = 16) cases were probed with anti-rabbit polyclonal antibodies (DAKO, Carpinteria, CA, USA) at a dilution of 1:300 (TTR) or 1:1000 (cystatin C). All sections were immunostained simultaneously and examined in a coded manner by two independent investigators. Controls included tissue sections lacking either primary or secondary antibody. The average age and postmortem interval (PMT) were 62 years and 7 h PMT for the control subjects (six male, two female) and 57 years and 6.5 h PMT for ALS subjects (nine male, seven female). The average time from diagnosis until autopsy for ALS subjects was 25 months (range of 12–50 months).
We initially used a small set of ALS (n = 4) and control (n = 4) CSF samples to optimize our methodology for reliably detecting CSF proteins. SELDI-TOF-MS was used to characterize CSF analytes of ALS and control subjects. We initially compared CSF spectra using four different Ciphergen Proteinchip binding surfaces: H4, WCX2, SAX2 and IMAC30. The most consistent and robust spectral peak patterns were obtained using the SAX2 and IMAC30 chips, and therefore all subsequent experiments were performed using these two chip surfaces. By comparing spectra from the same CSF samples analyzed on SAX2 and IMAC chips during an interval of multiple months, we determined that adequate reproducibility between experiments was achieved (data not shown). We also assayed a standard CSF sample on one spot of all chips within an experiment to measure the coefficient of variation within each experiment. Having established the reproducibility of the spectral data, we analyzed CSF samples from our study subjects.
We next used a training set of ALS (n = 15) and control (n = 21) CSF samples to detect protein peaks that are statistically different in ALS CSF. Direct comparison of spectra from SAX2 or Zn-IMAC30 chips between control and ALS subjects revealed very similar protein profiles (Fig. 1) with discernible differences in some signal intensities (Fig. 1, bottom panels). Ciphergen clustering software (BioWizard version 3.1) autodetected a total of 366 peaks (207 on SAX2 and 159 on Zn-IMAC30 chip) in the training group (n = 36). We used this software to perform univariate analysis of both the SAX2 and Zn-IMAC30 datasets (366 peaks) to identify CSF spectral peaks that exhibited statistically significant relative peak intensities (Fig. 2). We identified 15 protein peaks each from the SAX2 (Fig. 2a) and the Zn-IMAC30 (Fig. 2b) datasets; these represent approximately 8% of total peaks (366) identified by the BioWizard software and each had a statistically significant difference (p < 0.01) in peak intensity (either increased or decreased relative abundance). One peak at 13.78 kDa was common to both the chip types and exhibited decreased peak intensity in ALS vs. control cases (Fig. 2). Furthermore, we identified 22 additional signals with statistically significant difference in signal intensities with p-values ≤ 0.05 (data not shown). The 6.88 kDa peak in Fig. 2(b) is a double charged species of the 13.78 kDa peak. Therefore, our univariate analysis identified a total of 52 signals (14.2%) with significant alterations in peak intensity between control and ALS subjects, thus representing putative biomarkers.
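The assignment of the 6.88 kDa peak as a doubly charged species of the 13.78 kDa peak can be checked with a short calculation, sketched here under the assumption that the 13.78 kDa signal is the singly protonated ion. The variable names are our own and the masses are the rounded values quoted in the text.

```python
PROTON_MASS_DA = 1.00728  # mass of a proton in daltons


def mz_for_charge(neutral_mass_da, charge):
    """Return the m/z of a protonated ion, (M + z * m_proton) / z."""
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


singly_charged_mz = 13780.0  # 13.78 kDa peak, assumed to be [M+H]+
neutral_mass = singly_charged_mz - PROTON_MASS_DA
doubly_charged_mz = mz_for_charge(neutral_mass, 2)

observed_mz = 6880.0  # 6.88 kDa peak as reported
relative_error = abs(doubly_charged_mz - observed_mz) / observed_mz
print(f"expected [M+2H]2+ at {doubly_charged_mz:.1f}; "
      f"relative difference {relative_error:.2%}")
```

The expected and reported values agree to within roughly 0.2%, which seems compatible with the two-decimal rounding of the reported masses and with external calibration, although the tolerance chosen here is our assumption rather than a figure from the study.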
Having identified 52 protein signals that are statistically different in ALS CSF, we next sought to define combinations of signals (panels) with a high predictive value for ALS. The Rule Learner (RL) algorithm was used to create biomarker panels that can be directly applied to test subjects for diagnostic predictions. Using a training set of 34 subjects, RL created a series of learned rules using baseline corrected mass spectral data containing approximately 37 000 m/z data points from each of the SAX2 and IMAC ProteinChips (Provost and Buchanan 1995; Gopalakrishnan et al. 2004). RL identified the following m/z peaks as putative biomarkers that predicted ALS with 100% accuracy in the training set: SAX 3.70 kDa, SAX 7.78 kDa, SAX 11.52 kDa, IMAC 3.01 kDa, IMAC 8.61 kDa, IMAC 8.93 kDa, IMAC 9.09 kDa, IMAC 13.38 kDa, IMAC 16.59 kDa, IMAC 17.08 kDa (Table 1). RL next applied these rules to make disease predictions in 20 coded test subjects that included four healthy subjects and six neurologic disease controls (see Methods). RL correctly identified eight of 10 ALS subjects (RL provided no diagnostic prediction for one of the ALS subjects) and six of 10 control subjects. The diagnostic predictive values for this biomarker panel in the coded test group were: 80% sensitivity, 60% specificity and 74% accuracy. The overall coverage was 95% (19 of 20), which refers to the percentage of total predicted cases (ALS + Controls) excluding the ‘no prediction’ case, and the positive predictive value for ALS was 89%. ALS subjects in the training set used to create the biomarker panel had an average time from symptom onset until CSF draw of 385 days, suggesting that our biomarker panel is most predictive for subjects within approximately one year from clinical symptom onset.
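The sensitivity, specificity, accuracy and coverage quoted above can be reproduced from the counts given in the text, as in the following sketch. We assume that one ALS subject was misclassified in addition to the one left without a prediction, and that accuracy is computed over the predicted cases only; both are inferences from the reported percentages rather than statements from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp, unpredicted_pos=0, unpredicted_neg=0):
    """Return sensitivity, specificity, accuracy and coverage as fractions.

    Sensitivity and specificity use all subjects in each class, including
    those left without a prediction; accuracy uses predicted cases only.
    """
    predicted = tp + fn + tn + fp
    total = predicted + unpredicted_pos + unpredicted_neg
    return {
        "sensitivity": tp / (tp + fn + unpredicted_pos),
        "specificity": tn / (tn + fp + unpredicted_neg),
        "accuracy": (tp + tn) / predicted,
        "coverage": predicted / total,
    }


# Counts for the 20 coded test subjects: 8 of 10 ALS and 6 of 10 controls
# correct, with one ALS subject receiving no prediction.
metrics = diagnostic_metrics(tp=8, fn=1, tn=6, fp=4, unpredicted_pos=1)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```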
Next, we performed an independent re-analysis of the spectral data from these 54 samples with RL using a larger training set size. This was performed to identify additional biomarkers with high predictive value that may increase accuracy of the biomarker panel. We discarded two subjects who appeared to confound data analysis: an atypical metabolic myopathy subject who exhibited a poor quality mass spectrum in all replicate samples and a slowly progressing atypical motor neuron disease subject with a CSF draw approximately 10 years after symptom onset. The remaining 52 subjects were randomly split into a training set (N = 40) and test set (N = 12). The CSF spectra for the test group were not previously seen by RL during the training phase. RL was trained iteratively 12 times to find significant peaks and generate biomarker panels. The best biomarker model contained nine additional peaks with significant predictive value that achieved 100% accuracy for predictions of samples within the training set: SAX 2.40 kDa, SAX 2.46 kDa, SAX 6.88 kDa, SAX 12.28 kDa, SAX 13.65 kDa, SAX 13.67 kDa, IMAC 3.42 kDa, IMAC 7.02 kDa, IMAC 12.08 kDa (Table 1). Using these m/z features we attained 80% sensitivity, 100% specificity and 91% accuracy for diagnostic predictions of the coded test samples (11 of 12 test subjects accurately categorized by RL). One ALS subject within this small test group was incorrectly predicted, resulting in the drop in sensitivity. Although RL failed to identify any individual biomarker peaks that provided accurate predictions on the training or test groups, the combination of the 3.42 and 6.88 kDa peaks achieved a predictive accuracy of 80% for spectra of the training set (p = 0.045) with 80% sensitivity and 71% specificity for the 12 coded test subjects (data not shown).
We next determined the protein identity for RL biomarker peaks with high diagnostic predictive value, which included the 3.42, 6.88 and 13.38 kDa peaks. As noted above, the 6.88 kDa peak is a double-charge species of the 13.78 kDa mass peak. Peaks with high diagnostic predictive value were identified by individually removing peaks from the spectra and then re-analyzing with RL to evaluate any drop in sensitivity and specificity, as well as determining which combination of individual peaks provided the highest level of diagnostic accuracy when applied to the testing group (data not shown). These three m/z peaks were enriched from CSF using anion exchange chromatography (see Methods). SELDI-TOF-MS was used to monitor column fractions for the presence of specific mass peaks. Fractions containing mass peaks of interest were then separated by sodium dodecyl sulfate–polyacrylamide gel electrophoresis and bands of interest subjected to in-gel tryptic digestion followed by mass spectrometry. Peptide mass fingerprints were obtained and matched to the ProFound protein database. To confirm the protein identity, tryptic fragments for each biomarker were analyzed by tandem mass spectrometry (MS/MS) to obtain amino acid sequence information. For the 3.42 kDa peak we performed an on-chip enrichment using C4 Zip-tip fractionation and direct MS/MS amino acid sequencing of tryptic fragments. We identified the 3.42 kDa peak as a carboxy-terminal fragment of the neuroendocrine protein 7B2 (7B2CT), the 13.38 kDa peak as cystatin C, and the 6.88 (13.78) kDa peak as a monomer of TTR (Table 2). Although the obtained sequence coverage for 7B2 appears low (Table 2), the peptide sequences were mapped to the full-length 7B2 precursor protein. The 12 sequenced 7B2 tryptic peptides actually covered 100% of the 7B2CT protein sequence.
To validate our findings we obtained commercially available antibodies to TTR and cystatin C and performed immunoblot and immunohistochemistry with separate cohorts of age-matched control and ALS subjects. Antibodies to 7B2CT were not available for these studies and will be generated for future investigation. CSF samples used for immunoblot (17 control and 17 ALS subjects) included 14 subjects used for mass spectrometry and 20 coded subjects that were not used in prior mass spectrometry experiments. Control subjects included both healthy individuals and neurologic disease controls (see Methods). For the ALS subjects, the average time from symptom onset to CSF draw was 466 days. Results demonstrated that the level of the 13.78 kDa TTR monomer was significantly reduced in the CSF of most ALS subjects as compared to controls and confirmed our mass spectrometry results (Fig. 3). The 55 kDa homotetramer form of TTR was also reduced in ALS subjects. Using a set of 20 coded test subjects, TTR protein levels as measured by immunoblot predicted ALS with 70% sensitivity and 60% specificity. Cystatin C protein levels were also reduced in the CSF of many ALS subjects (Fig. 3), though both the sensitivity and specificity for predicting ALS using this one biomarker were reduced relative to TTR.
Lumbar spinal cord tissue samples from eight healthy controls and 16 ALS subjects were immunostained for TTR or cystatin C (See Methods for description of cases). TTR was observed in a punctate staining pattern within the cytoplasm of motor neurons in control subjects (Figs 4a and b). Reduced levels of TTR were often observed in motor neurons that remained in the spinal cords of ALS subjects (Figs 4c and d). The TTR positive punctate structures in Fig. 4 not contained in motor neuron cell bodies likely represent ubiquitin-positive punctate structures commonly seen in the CNS of the aged and are not linked to any specific disease (Dickson et al. 1992). The cystatin C antibody labeled Bunina bodies in motor neurons of ALS subjects as previously reported (Okamoto 1993).
In this study we have used SELDI-TOF-MS to profile CSF and identify biomarkers for ALS. We report 30 spectral peaks with statistically significant differences in peak intensities between ALS and control subjects (p < 0.01). RL identified 10 m/z peaks from a training set of 24 subjects that predicted ALS disease status in coded test subjects (N = 20) with 80% sensitivity, 60% specificity and 74% accuracy. Increasing the training set to 40 subjects revealed nine additional m/z peaks that increased the specificity to 100% and accuracy to 91% for a separate testing group of 12 subjects. We have determined the identity of three biomarker peaks that RL analysis recognized as having high diagnostic predictive value; as compared to control CSF, two were decreased (TTR and cystatin C) and one was increased (the carboxy-terminal fragment of the neuroendocrine protein 7B2) in ALS CSF. The mass spectrometry results for TTR and cystatin C were confirmed using immunoblot and immunohistochemistry on two separate and distinct cohorts of ALS and control subjects. A total of 112 subjects (57 ALS and 55 healthy and neurologic disease controls) collected at two medical centers were used in this study for biomarker discovery and validation.
We note that SELDI-TOF-MS permits only a limited analysis of proteins of high molecular mass and therefore additional biomarkers that were not identified in this study may be present within this mass range. Additional studies with a larger cohort of control and ALS subjects are necessary to both confirm and validate these initial findings and further establish the disease specificity for the biomarker panel. With increased sample size, our studies may also be extended to distinguish a spectrum of motor neuron diseases.
A diagnosis of ALS is typically made after exhaustive clinical tests over many months to eliminate other potential causes for the presenting symptoms, and patients often exhibit symptoms for months prior to seeking medical evaluation. A panel of predictive biomarkers will aid in a more rapid clinical diagnosis, permit initiation of therapy at or near the onset of clinical symptoms, and avoid unnecessary or improper treatment interventions. It has been reported that up to 10% of ALS diagnoses are false-positive and up to 44% may be false-negative (Brooks 1999). Therefore a rapid and accurate diagnostic test would be quite beneficial for ALS patients and families. For this study we used CSF samples obtained from subjects during initial clinical diagnostic evaluation or at the time of clinical diagnosis to identify biomarkers near the time of disease onset. However CSF samples were not included in this study until a definitive clinical diagnosis was complete. Of the healthy control subjects misclassified, one subject had leg stiffness with an unknown diagnosis (though multiple sclerosis has been ruled out), and another experienced arm numbness with a family history of neurodegenerative disorders. The biomarker panel failed to correctly predict ALS for two additional test subjects with a time from symptom onset to CSF draw of 1339 and 3913 days (data not shown). This suggests that the biomarker signature pattern may change during disease progression. Longitudinal studies are required to examine how the proteomic biomarker signature pattern changes during disease progression within individual ALS patients.
RL analysis confirmed that no individual biomarker peak provides accurate diagnostic predictions in our training or coded test groups. However, RL determined that using only the 7B2CT (3.42 kDa) and TTR (6.88 kDa) biomarker peaks provided 80% sensitivity and 71% specificity for ALS in the small coded test group (N = 12). These data support our contention that a panel of biomarkers is required for diagnostic predictions with greater than 90% accuracy. Overall, no individual or pair-wise combination of biomarkers provides diagnostic predictions with the accuracy of the complete biomarker panel, though the combination of 3.42, 6.88 and 13.38 kDa biomarker peaks exhibited significant diagnostic predictive value and provided the basis for determining the protein identity of these peaks.
The protein level of 7B2CT (3.42 kDa) increases in ALS, whereas levels of both TTR (13.78 kDa) and cystatin C (13.38 kDa) decrease. Human 7B2 protein is localized to the secretory granules of neurons (including motor neurons) and endocrine cells (Marcinkiewicz 1993; Marcinkiewicz et al. 1994). It interacts with proprotein convertase 2 (PC2) within the trans-Golgi network and aids in the maturation of pro-PC2. Mature PC2 then catalyzes the conversion of hormone and neuropeptide precursors into their active forms. In addition, 7B2 has been shown to function as a chaperone in the maturation of growth factors such as IGF-1 (Chaudhuri 1995). Furin cleaves 7B2 within the Golgi into a 21 kDa fragment and a 3.4 kDa carboxy terminal fragment called 7B2CT. 7B2CT inhibits the maturation and function of PC2 (Zhu 1996; Hwang and Lindberg 2001). Thus, 7B2 is an important chaperone in the secretory pathway for the proper maturation and release of numerous hormones, neuropeptides and growth factors, and 7B2CT negatively regulates the function of PC2. Increased levels of 7B2CT in ALS subjects may be a reactive response to altered enzymatic activities that generate or degrade 7B2CT. Alternatively, increased levels of 7B2CT in ALS may result from Golgi fragmentation within motor neurons during ALS (Gonatas 1992).
The observation that cystatin C and TTR levels are decreased in ALS CSF near the time of symptom onset is of considerable interest but must be interpreted cautiously. One interpretation is that the protein deficiency is causal, triggering one or more cascades of molecular events that reduce motor neuron viability. On the other hand, reduced protein levels in a neurodegenerative disorder like ALS may represent an expected, secondary consequence of neuronal cell loss.
Cystatin C is a 13.3 kDa secreted protein that belongs to the class of cysteine protease inhibitors and plays an important role in regulating extracellular protein homeostasis in the CNS. The choroid plexus is a major site for the synthesis of cystatin C and CSF concentrations of this protein are ~5.5 times that of serum (Davidsson et al. 1997). Decreased levels of cystatin C in ALS may indicate increased proteolysis via cysteine proteases. Cystatin C levels are altered in other neurodegenerative diseases such as Alzheimer's disease (AD) and Creutzfeldt–Jakob disease, and in models of pain (Deng 2001; Levy 2001; Kalso 2004; Sanchez 2004). A recent SELDI-TOF-MS analysis of CSF from AD subjects revealed increased cystatin C levels in AD (Carrette et al. 2003), suggesting cystatin C protein levels may exhibit distinct alterations in different neurodegenerative disorders. Cystatin C is localized to Bunina bodies, a specific neuropathologic hallmark of ALS contained in degenerating motor neurons (Okamoto 1993; van Welsem 2002; Seilhean 2004). Mutations in the cystatin C gene are associated with a rare hereditary brain amyloid angiopathy that also results in decreased CSF levels of cystatin C and increased amyloid aggregation and deposition (Coria and Rubio 1996). Further studies will be pursued to explore potential functional links between cystatin C, protein aggregation and ALS.
In our mass spectrometry study, TTR was resolved into a series of m/z peaks and levels of specific TTR m/z peaks were reduced in ALS CSF. TTR is synthesized predominantly in cells of the choroid plexus and liver and secreted into the CSF or plasma, respectively (Schreiber 2002). TTR is also produced in neurons and we report that motor neurons express TTR (Fig. 4). The reduced level of TTR in ALS spinal cord is likely due to a reduction both in the numbers of motor neurons and in the levels of TTR expression within remaining motor neurons. TTR immunoreactivity was also observed in other cell types in the spinal cord, which will be the focus of future studies. We also note that TTR levels are reduced in the CSF of postmortem ALS subjects when compared to healthy controls (data not shown), indicating that the reduction in TTR spectral peaks in ALS CSF occurs early in the disease process and continues throughout the course of disease. TTR is required for the transport of thyroxine and for the transport of retinol/vitamin A via interactions with retinol-binding protein (Monaco 2000; Power 2000). Decreased levels of TTR have been noted in the CSF of late stage AD patients (Serot et al. 1997), indicating that decreased TTR levels are not unique to ALS. A recent study indicates that TTR has neuroprotective functions in a transgenic mouse model of AD (Stein et al. 2004). The basis for the neuroprotective effect has not been delineated. Because TTR is known to form aggregates and bind multiple proteins, it is conceivable that TTR deficiency might lead to inadequate sequestration of abnormally functioning proteins. We also note that polymorphisms of the TTR gene induce increased protein aggregation and result in familial amyloid polyneuropathy and familial amyloid cardiomyopathy (Saraiva 2001).
Considered together, these findings suggest that low levels of TTR in ALS motor neurons could reduce its neuroprotective function and thus render these neurons more susceptible to neurodegenerative insults, a hypothesis that merits further analysis.
We acknowledge Xiaoting Tang and Paul Wood for technical assistance with mass spectrometry, and Kate Jordan, Ramasri Sathanoori, Georgina Nicholl and Sarah Henry for assistance with sample preparation. We also thank Drs Marielena McGuire and Jeffrey Scibek from Ciphergen Biosystems for assistance in protein enrichment and identification. We are grateful to Dr Bruce Buchanan for providing us access to Rule Learner and its JAVA version. Funding support was provided by the ALS Association (MC, RHB and RB), NIH/NIEHS ES013469 (RB), NIH/NIGMS GM071951 (VG), NIH/NIA AG12992 (RHB) and NIH/NINDS NS46631, NS050557 and NS038679 (RHB). We also acknowledge the support of Project ALS, the Angel Fund, the Al-Athel ALS Foundation and the Pierre L. deBourknecht ALS Research Fund (RHB), and of ZazAngels and Ride For Life (RB).