Biomarkers for neurodegenerative disorders are essential to facilitate disease diagnosis, ideally at early stages, monitor disease progression, and assess response to existing and future treatments. Application of proteomics to the human brain, cerebrospinal fluid and plasma has greatly hastened the unbiased and high-throughput searches for novel biomarkers. There are many steps critical to biomarker discovery, whether for neurodegenerative or other diseases, including sample preparation, protein/peptide separation and identification, as well as independent confirmation and validation. In this review we have summarized current proteomics technologies involved in discovery of biomarkers for neurodegenerative diseases, practical considerations and limitations of several major aspects, as well as the current status of candidate biomarkers revealed by proteomics for Alzheimer and Parkinson diseases.
Neurodegenerative disorders, such as Alzheimer disease (AD) and Parkinson disease (PD), are a heterogeneous group of disorders characterized by a progressive and selective loss of anatomically or physiologically related neuronal systems (Forman et al., 2004; Lang and Lozano, 1998; Rathmann and Conner, 1984; Skovronsky et al., 2006). While the majority of these diseases affect patients later in life, abundant evidence suggests the existence of a “preclinical” stage that could start years before a subject can be diagnosed, where an individual appears normal but is developing extensive pathological changes in the brain (Litvan et al., 2007; Morris and Price, 2001).
Although many neurodegenerative diseases cannot be cured at the present time, there are often symptomatic treatments available and new drugs are emerging to forestall and/or reverse the onset and/or progress of the diseases (Forman et al., 2004; Skovronsky et al., 2006). Thus, an early diagnosis will at least assist in the better management of patient care. During the development and implementation of neuroprotective therapies, a major task of today, it will be important to have robust biomarkers that can aid in diagnosis, identify subsets of patients (personalized medicine so to speak), and objectively monitor progression and response to treatment. However, the major challenge to the clinical diagnosis of most neurodegenerative disorders is the increasing recognition that a variety of these diseases can mimic each other or coexist with each capable of contributing to the symptomatic expression of brain failure (Forman et al., 2004; Skovronsky et al., 2006). For example, AD and PD commonly co-occur, and a common subtype of AD is the Lewy body (LB) variant (LBVAD). In addition, a single type of pathology can produce a variety of different types of symptoms, making it difficult to reliably identify the pathology based solely on the clinical phenotype. This is best exemplified in patients with dementia with LB (DLB) vs. PD with dementia (PDD); it is exceedingly controversial as to whether they are the same disease with different clinical spectrums or two distinct disorders with identical final pathological outcomes. Thus, additional biomarkers with higher sensitivity and specificity that can be applied widely and at a reasonable cost are needed for greater accuracy, ideally at preclinical stages of disease when therapy is likely to have the greatest impact.
Research into biomarkers for neurodegenerative diseases has taken many directions. Advances have been made in neuroimaging techniques that assess regional structure, function and biochemistry of the brain, as well as in identifying biochemical indices of brain dysfunction, measured in body fluids such as cerebrospinal fluid (CSF), plasma and urine (Clark et al., 2008; Galasko, 2005; Shaw et al., 2007; Sonnen et al., 2007). Most available biochemical markers have been developed as an extension of targeted physiological studies that investigate known pathways such as those believed to be involved in neurodegeneration. These approaches, however, have achieved limited success. By contrast, emerging technologies are beginning to allow the systematic, unbiased characterization of variation in genes, messenger RNA (mRNA), proteins and metabolites associated with disease conditions and identification of novel biomarkers (Han, 2007; Sharp et al., 2006; Zhang et al., 2005a). Of the emerging platforms, proteomics perhaps has garnered more recent attention, largely because proteins are readily available in body fluids (a particularly important point in diseases of the central nervous system [CNS]) and are more stable than mRNA and metabolites. Although still in their infancy compared with other approaches, proteomic technologies offer complementary insight into the full complexity of the disease phenotype. This review will therefore focus on the major proteomic technologies and their applications in AD and PD biomarker discovery and validation. Notably, single protein studies developed as targeted investigation of known pathways using non-proteomic technologies, though not covered extensively in this article, can be found in a few recent reviews elsewhere (Aarsland et al., 2008; de Jong et al., 2007; Shaw et al., 2007; Sonnen et al., 2007).
Proteomics is the study of both the structure and function of proteins by a variety of methods. Advancing technologies, particularly the evolution of 2-dimensional gel electrophoresis (2-DGE) based approaches into liquid chromatography (LC) based high-resolution tandem mass spectrometry (MS/MS), have radically improved the speed and precision of identifying and measuring proteins in biological fluids and other samples. The emerging technology of quantitative proteomics also provides a unique opportunity to reveal static or perturbation-induced changes in a protein profile. However, due to the enormous complexity of biological systems and the complex nature of proteins, an effective, unbiased proteomics profiling (protein profiling without any type of preselection for biomarkers) requires a multi-discipline approach to accomplish the goal of protein identification and quantification. Such a concerted approach typically includes several components, including sample preparation, protein or peptide separation, protein or peptide identification by MS or MS/MS and bioinformatic data processing, as well as independent confirmation and validation.
Human CSF, being most proximal to the CNS, is one of the ideal sources for identifying biomarkers for neurodegenerative diseases. An important consideration is the relative availability of CSF obtained by lumbar puncture (LP), making it possible to conduct longitudinal molecular analyses of changes in CSF during the course of diseases. In addition to CSF, the use of plasma, urine, and saliva for biomarker discovery has also been explored as a more practical/acceptable approach for both the patient and clinician; however, these attempts have met with less success so far, with few exceptions (e.g., Ray et al., 2007), primarily because of the complexity of peripheral samples and confounding factors introduced by other organ systems. Thus, at this stage of discovery, we believe that the best approach might be to establish an ensemble of effective biomarkers in well-characterized human CSF of a relatively small number of individuals involved in research settings, then subsequently pursue their quantification in plasma or urine/saliva for more widespread application (Zhang, 2007). However, it is also very possible that potential biomarkers can be identified in plasma or urine/saliva first, and then traced back to the CNS. Of note, biomarker discovery does not have to start with body fluids. This is because protein biomarkers identified in human brain may also ultimately enter CSF and/or plasma that can be collected clinically in a living patient. It is indeed an approach currently taken for PD biomarker discovery (see later section for detailed discussion).
As in any other field, the quality of proteomic results depends strongly on the condition of the starting material. Details on CSF, plasma or urine sample preparation have already been discussed in some recently published reviews (Drabik et al., 2007; Galasko, 2005; Thongboonkerd, 2007; Zhang, 2007). Here we will only point out some major principles in handling CSF, a commonly utilized sample source for proteomic analysis of CNS disease. Firstly, for CSF collection, given the production/absorption dynamics of CSF, it is suggested to match the fractions of CSF collected at LP and the timing of LP in addition to subjects’ age, gender and other common variables (Zhang, 2007). Secondly, later fractions (15–25th mL) from each subject are preferred in biomarker discovery studies because they are closer to the brain and therefore, at least in theory, the concentrations of proteins derived from the brain are relatively higher than those contained in earlier fractions of LP. Furthermore, blood contamination needs to be controlled when assessing the quality of CSF samples. A minute contamination with blood can have a large effect on the concentration of CSF proteins (due to an enormous plasma/CSF protein concentration ratio), thereby confounding quantitative proteomics analysis (see Zhang, 2007 for details). Finally, CSF is known to contain high amounts of salts and a low protein concentration; desalting and pre-fractionation steps should therefore be considered during preparation of CSF for analysis (Drabik et al., 2007; Zhang, 2007).
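The effect of blood contamination noted above can be illustrated with simple mixing arithmetic. The following is a minimal sketch only: the function name, the concentrations, and the contamination fraction are hypothetical values chosen for illustration, not measurements from the cited studies.

```python
def apparent_csf_concentration(csf_conc, plasma_conc, plasma_fraction):
    """Concentration measured in CSF that contains a volume fraction of plasma.

    All inputs are illustrative; both concentrations must use the same unit.
    """
    if not 0.0 <= plasma_fraction <= 1.0:
        raise ValueError("plasma_fraction must be between 0 and 1")
    return (1.0 - plasma_fraction) * csf_conc + plasma_fraction * plasma_conc


# Hypothetical protein that is 1000-fold more concentrated in plasma than in CSF,
# measured in CSF contaminated with 0.1% plasma by volume.
true_csf = 1.0
measured = apparent_csf_concentration(true_csf, 1000.0, 0.001)
fold_change = measured / true_csf
print(f"apparent fold change: {fold_change:.3f}")
```

Under these assumed numbers, a contamination of only 0.1% by volume roughly doubles the apparent concentration, which is comparable in size to many disease-related changes sought in quantitative profiling.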
Proteins in a sample typically can be separated by one of three methods: 2-DGE (Friedman et al., 2004; O’Farrell, 1975), LC (Link et al., 1999; Washburn et al., 2001), or, more recently, “protein chips” (Hutchens and Yip, 1993; Yip and Lomas, 2002). The next step is the identification of the separated proteins, which is performed using MS techniques and bioinformatics. The type of MS used is typically paired with the method of protein separation: 2-DGE and LC with either matrix-assisted laser desorption/ionization (MALDI) or electrospray ionization (ESI) MS, and protein chips with one particular type of MALDI (to be discussed further after introduction of MALDI technology), a combination called surface-enhanced laser desorption/ionization (SELDI). Each technology is used to determine different types of information and has its own strengths and limitations (Table 1).
2-DGE separates proteins according to charge and size in two discrete steps: isoelectric focusing (IEF) and sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE). This separation results in a distinct pattern of protein spots that can be identified using MS methods, providing protein identification as well as quantitative information of a given candidate biomarker. Additionally, 2-DGE separation stands out for its remarkable resolving power and utilization of various dyes that allow for quantitative analysis, leading to increased sensitivity, reproducibility, and throughput of proteome analysis (Friedman et al., 2004; Friedman et al., 2007). Furthermore, 2-DGE is highly suitable for the study of post-translational modifications (PTMs) of proteins, because isoforms are easily resolved. For these reasons, many investigators have utilized this technology to look for CSF and plasma biomarkers in neurodegenerative diseases like AD (Castano et al., 2006; Davidsson et al., 2002; Finehout et al., 2007; Hu et al., 2007; Hu et al., 2005; Hye et al., 2006; Puchades et al., 2003). Some limitations of 2-DGE are that it is relatively laborious, not applicable for proteins/peptides smaller than 10 kDa, troubled by co-migration issues (one stained spot may contain more than one protein) and it has a limited use for highly hydrophobic proteins (Thongboonkerd, 2007; Zhang, 2007).
In contrast, the online application of nanoscale LC and MudPIT (multidimensional protein identification technology) with MS or MS/MS has resulted in the identification of peptides in mixtures in a single analysis and offered an increased potential to detect low abundance proteins (Drabik et al., 2007). Indeed, the number of proteins identified in each proteomics experiment has increased exponentially since LC-based, and especially MudPIT-based, proteomics has been used to characterize the CSF proteome (Abdi et al., 2006; Maccarrone et al., 2004; Pan et al., 2007b; Ramstrom et al., 2004; Wenner et al., 2004; Zhang et al., 2005b). The key to the success of this technology is separation of digested peptides, i.e., better chromatography leads to increased protein identification. However, this technology is also relatively time-consuming, less effective in detecting proteins with PTMs, and sensitive to interfering compounds. Moreover, quantitative analysis using LC-MS/MS is not as straightforward as with 2-DGE (see discussion below).
As the center of a proteomics analysis, a modern mass spectrometer consists of three major modules: ion source, mass analyzer and detection unit. Based on the ion source used, most of the mass spectrometers used in the field of proteomics can be generally categorized into two types, i.e., ESI and MALDI instruments. The most widely used mass analyzers include ion trap, triple quadrupole, time-of-flight (TOF) and Fourier transform ion cyclotron resonance (FTICR). They differ markedly in their mechanism of ion separation, mass accuracy and resolution, and are complementary in protein identification when used in concert (Caudle et al., 2008).
SELDI was introduced as a variation on the MALDI concept (Kiehntopf et al., 2007; Yip and Lomas, 2002). What makes this technique unique is its connection to a protein chip array, either a chemically preactivated surface (ionic, hydrophobic, chelating metal, etc.) or a protein-specific surface (antibody, receptor) for selective capture of proteins. The chip surface serves to fractionate and enrich subpopulations of proteins from complex protein mixtures, thereby simplifying the analysis. In addition, the chip affinity feature allows samples to be prepared under buffer conditions that might otherwise interfere with more traditional MS analysis. The proteins that are retained on the surface are subsequently ionized and detected by MALDI-TOF MS. There is, however, an apparent disadvantage of SELDI; specifically, it typically analyzes the absence or presence of a particular signature peak rather than determining the relative abundance of a protein unless extensive postchip work-up is performed. In addition, its reproducibility is still a concern, and proteins with high molecular weights are analyzed less efficiently by SELDI (Thongboonkerd, 2007; Zhang, 2007). Despite these caveats, SELDI has been successfully applied to analysis of potential biomarkers in the CSF of AD and frontotemporal dementia patients (Carrette et al., 2003; Ruetschi et al., 2005; Simonsen et al., 2008).
A necessary and informative step in protein biomarker discovery is to detect quantitative alterations of a protein under different disease and control settings. Traditionally, proteins are quantified by 2-DGE, which has many limitations as discussed above (de Jong et al., 2007; Friedman et al., 2007; Zhang, 2007); thus, development of novel quantitative proteomics represents a milestone of proteomics technology. In the past several years, many MS-based quantitative proteomics methods have been developed (Aebersold and Mann, 2003). These methods include the use of chemical reactions to introduce isotopic tags at specific functional groups on peptides or proteins, such as isotope coded affinity tags (ICAT) (Gygi et al., 1999) and isobaric tags for relative and absolute quantitation (iTRAQ) (Ross et al., 2004), as well as others.
ICAT labels the side chains of cysteinyl residues in two reduced protein samples using the isotopic light or heavy reagent, respectively, and generates the mass signatures that identify sample origin and serve as the basis for accurate quantification, thus affording comparison of two proteomes simultaneously. Using ICAT analysis, most procedures of sample preparation are performed after the combination of the two compared groups, largely removing variations secondary to sample processing. Indeed, this technology has been utilized to investigate changes in the CSF proteome associated with AD (Zhang et al., 2005b), identifying approximately 400 proteins, many of which were altered in relative abundance in aging and/or AD. Nonetheless, there are several limitations to ICAT, including: (i) quantification is restricted to cysteine-containing proteins; and (ii) it can only compare two conditions at a time. To overcome these limitations, several novel isobaric tagging based methods (iTRAQ, for example) have been developed. iTRAQ is based on chemically tagging the N-terminus of peptides generated from protein digests that have been isolated from samples in, for example, different disease and control groups. The technique allows for the identification and quantification of all peptides as well as comparison of up to eight conditions simultaneously (Fig. 1). We have recently utilized this method to compare the CSF proteome of patients with AD, PD, and DLB as compared to age-matched controls (Abdi et al., 2006).
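The relative quantification afforded by multiplexed tags such as iTRAQ can be sketched as follows. This is a simplified illustration rather than the procedure used in the cited studies: the channel names, the intensities, and the choice of the median to summarize peptide ratios into a protein ratio are all assumptions made for the example.

```python
from statistics import median


def protein_ratios(peptide_intensities, reference="control"):
    """Summarize reporter-ion intensities into protein-level ratios.

    peptide_intensities: one dict per peptide assigned to the protein, mapping a
    channel name (hypothetical labels here) to its reporter-ion intensity.
    Returns the median peptide ratio of each channel against the reference.
    """
    channels = [c for c in peptide_intensities[0] if c != reference]
    ratios = {}
    for channel in channels:
        per_peptide = [
            p[channel] / p[reference]
            for p in peptide_intensities
            if p[reference] > 0  # skip peptides without a usable reference signal
        ]
        ratios[channel] = median(per_peptide)
    return ratios


# Made-up intensities for three peptides of one protein across three channels.
peptides = [
    {"control": 100.0, "AD": 150.0, "PD": 50.0},
    {"control": 200.0, "AD": 320.0, "PD": 90.0},
    {"control": 80.0, "AD": 120.0, "PD": 44.0},
]
result = protein_ratios(peptides)
print(result)
```

Because all channels are measured in the same spectrum, each peptide contributes one ratio per condition; a robust summary such as the median limits the influence of a single aberrant peptide, although other summaries are equally plausible.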
More recently, label-free quantitative proteomics in LC-MS/MS is also being actively investigated, exploring yet another technique for comparative proteomics. These techniques, though not being utilized yet in neurodegenerative disease proteomics and biomarker discovery, appear to provide excellent quantitative data at least for abundant proteins (Cutillas and Vanhaesebroeck, 2007; Finney et al., 2008). SELDI also possesses a quantitative ability without having to use an additional tag; however, as it stands now, quantitative analysis by this technology is not reproducible. Thus, it is essential to better understand the factors that influence the quantification of peptides and proteins in the analysis of complex biological samples and to develop methods that permit a more robust and precise quantification (Kiehntopf et al., 2007).
The biomarkers identified by proteomics need to be validated before their clinical utility is pursued. At the protein level, the most widely used technology to date for biomarker verification and validation is the enzyme-linked immunosorbent assay (ELISA), or, more recently, the bead-based multi-analyte profiling (MAP) technology (such as Luminex) (Olsson et al., 2005). The major difficulty associated with these approaches centers on the availability of specific antibodies against novel proteins identified by proteomics and the difficulty to detect changes of low abundant proteins/peptides by these methods. To circumvent this problem, several groups of investigators are actively involved in MS-based targeted quantitative proteomics (Aebersold, 2003; Anderson, 2005; Gerber et al., 2003; Pan et al., 2005). This technology uses isotope dilution followed by MS analysis, in which test-samples are supplemented (spiked) with isotope-labeled synthetic peptides, which serve as the signature markers for the identification and quantification of native peptides (target) within each sample. More recently, these techniques have been utilized to examine proteins in human serum as well as CSF (Anderson and Hunter, 2006; Anderson et al., 2004; Pan et al., 2008; Pan et al., 2005). These early investigations have demonstrated the feasibility and advantages of MS-based targeted quantitative proteomics to simultaneously identify and quantify a panel of selected peptides/proteins in a complex milieu, and consequently could be applied to biomarker verification/validation for neurodegenerative diseases such as AD and PD. Moreover, targeted proteomics is also advantageous in dealing with markers low in abundance, a typical scenario when assessing CNS markers in peripheral plasma or even CSF, and proteins/peptides with PTMs (see below for more details). 
However, currently its implementation is still relatively expensive and technologically challenging, thus limiting its widespread clinical applications.
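The isotope dilution principle described above reduces to a simple ratio calculation. The sketch below assumes that the native peptide and its isotope-labeled synthetic counterpart give the same signal response per mole, which is the usual working assumption of the approach; the function name, peak areas, and spiked amount are hypothetical.

```python
def isotope_dilution_amount(native_area, heavy_area, spiked_fmol):
    """Estimate the amount of native peptide from a spiked isotope-labeled standard.

    Assumes equal signal response per mole for the native and labeled peptides.
    """
    if heavy_area <= 0:
        raise ValueError("the labeled standard must give a positive signal")
    return native_area / heavy_area * spiked_fmol


# Hypothetical peak areas: the native peptide gives half the signal of a
# standard spiked at 50 fmol, so about 25 fmol of native peptide is estimated.
amount = isotope_dilution_amount(native_area=3.0e5, heavy_area=6.0e5, spiked_fmol=50.0)
print(f"estimated native peptide: {amount:.1f} fmol")
```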
Over the last few years, proteomic techniques have been utilized to discover novel markers in different neurodegenerative disease settings, with AD being studied most extensively. As AD biomarkers revealed by proteomics have been discussed extensively by several groups (de Jong et al., 2007; Shaw et al., 2007; Sonnen et al., 2007; Zhang, 2007; Zhang et al., 2005a), we will only provide a brief summary on this topic before discussing PD biomarkers more extensively.
All three platforms, 2-DGE-, LC-, and SELDI-based proteomics, have been employed to look for AD biomarkers in human CSF and plasma/serum over the last decade (Abdi et al., 2006; Carrette et al., 2003; Castano et al., 2006; Davidsson et al., 2002; Finehout et al., 2007; Hu et al., 2007; Hu et al., 2005; Puchades et al., 2003; Simonsen et al., 2008; Zhang et al., 2005b). Among them, with 2-DGE, Finehout et al. identified a panel of 23 spots that could be used to differentiate AD from non-AD with high sensitivity and specificity (Finehout et al., 2007); Hu et al. revealed 18 proteins/isomers that were altered significantly in mild AD patients versus controls and validated selected candidates using ELISA (Hu et al., 2007). Using LC-MS, Abdi and colleagues revealed more than 100 candidate markers, including brain-derived neurotrophic factor (BDNF), interleukin (IL)-8, vitamin D binding protein (VDBP), apolipoprotein (apo) AII, and apoE, as potential CSF biomarkers for AD (Abdi et al., 2006; Zhang et al., 2005b; Zhang et al., 2008). Furthermore, with SELDI-TOF-MS, Carrette et al. identified four CSF proteins (cystatin C, two β2-microglobulin isoforms, VGF polypeptide and one unidentified protein) as novel potential biomarkers for AD (Carrette et al., 2003). In a more recent SELDI study, 15 potential biomarkers were identified and a panel of five markers (Cystatin C, truncated Cystatin C, Amyloid beta1–40 [Aβ1–40], C3a anaphylatoxin des-Arg and a 4.0 kDa protein) together with total tau (t-tau) and Aβ1–42 analysis could distinguish AD from healthy control individuals with high sensitivity and specificity (Simonsen et al., 2008). Finally, because blood is more easily accessible than CSF, a search is also underway for useful plasma biomarkers in AD. For example, using 2-DGE analysis of plasma proteins combined with LC-MS/MS characterization, Hye et al. demonstrated increased concentrations of complement factor H and α-2-macroglobulin (Hye et al., 2006). 
In another study, four potential biomarker peaks were also identified using the serum albumin-bound fraction from AD and control subjects (German et al., 2007). Additionally, as mentioned earlier, biomarkers could also be identified in human brain tissues first; studies have applied proteomic technologies to characterize specific proteins in AD brain, for example, using redox proteomics (a branch of proteomics that identifies oxidatively modified proteins) a number of proteins that are oxidatively modified in AD brain have been identified (Butterfield and Sultana, 2007; Sultana et al., 2006).
There are several issues, however, related to the candidate markers identified in some of these studies (de Jong et al., 2007; Zhang, 2007), including: i) few studies are conducted with pathologically verified cases, ii) a significant portion of the proteins identified in CSF are also present in human plasma (yet blood contamination of CSF is not controlled for during sample preparation), and iii) few candidate markers are confirmed by alternative techniques or in independent studies. In addition, several of the proteins identified, e.g., (truncated) cystatin C and VGF, have also been nominated as biomarkers for amyotrophic lateral sclerosis (Pasinetti et al., 2006; Ranganathan et al., 2005), schizophrenia (Huang et al., 2006), multiple sclerosis (Irani et al., 2006) and other diseases (de Jong et al., 2007; Zhang, 2007), suggesting that these proteins may not be AD-specific. This emphasizes that one or more disease controls need to be included in the experimental design to increase the specificity of the candidate markers. Finally, all markers identified in peripheral samples (i.e., plasma/serum or urine) need to be assessed for whether they can be influenced by other organ systems, as patients with AD and PD typically also have diseases of other organ systems due to aging.
PD is characterized by a profound and relatively selective loss of neurons in the brainstem, including dopaminergic neurons in the substantia nigra pars compacta (SNpc) and the presence of cytoplasmic α-synuclein-enriched inclusions called LBs in surviving neurons (Forman et al., 2004; Lang and Lozano, 1998; Moore et al., 2005). Spreading of LBs from the brainstem to the limbic system and, eventually, to the isocortex has been described in advanced PD with dementia (Braak et al., 2003). As the first step towards biomarker discovery for PD, proteomic technologies have been applied extensively to characterize human brain and CSF proteomes under the disease settings.
In tissue-based proteomics, using 2-DGE, Fasano and colleagues performed the first unbiased profiling work in human SNpc, with 44 proteins identified and nine of them differentially expressed in PD tissues with respect to controls (Basso et al., 2004). With the same technology, Werner et al. identified 221 differentially expressed spots in SNpc between PD and healthy control groups (Werner et al., 2008). Sixteen proteins were identified from 25 selected spots, including ferritin H, glutathione-S-transferase (GST) M3, P1, O1 and glial fibrillary acidic protein (GFAP). A more extensive investigation of the human SNpc performed more recently identified 1263 non-redundant proteins in total (Jin et al., 2006; Kitsou et al., 2008). In this study, proteins were identified and quantified by comparing samples of SNpc from PD and control patients labeled with iTRAQ or ICAT before being analyzed by MALDI TOF/TOF and linear trap quadrupole (LTQ) mass spectrometric platforms, respectively. Many identified proteins displayed significant differences in their relative abundance between the disease and control groups, one of which, mortalin, is substantially decreased in PD brains as well as in a cellular model of PD (Jin et al., 2006). Attempts were also made to characterize human cortical proteins that were affected during PD progression, including development of PD dementia, using LC-MS/MS-based technology together with iTRAQ labeling (Pan et al., 2007a; Shi et al., 2008). Over 800 proteins were confidently identified in the study, and approximately 200 proteins, including mortalin, were found to display significant differences between PD patients at various stages and controls. Differentially expressed proteins identified in human midbrain and cortex (Kitsou et al., 2008; Pan et al., 2007a) can be classified into multiple categories based on Gene Ontology analysis and annotations (Fig. 2).
A subset of these proteins is under intensive investigation as to their contribution to PD development and progression. The deregulated proteins in PD tissue are also being interrogated as potential biomarkers in body fluids, including plasma, using targeted proteomics as discussed earlier in the section of biomarker validation.
In parallel with the strategy of first characterizing tissue biomarkers and then pursuing them in body fluids by targeted proteomics, several groups have also tried to characterize the human CSF proteome as extensively as possible (see review by Zhang, 2007). As of today, approximately 3,000 proteins have been identified when both qualitative and quantitative results are combined. In an iTRAQ study, where relative changes in the CSF proteome were compared among patients with AD, PD, and DLB and healthy controls (Abdi et al., 2006), more than 1500 proteins were identified in human CSF; of those, 136, 72, and 101 proteins appeared to be uniquely associated with AD, PD, and DLB, respectively. Some of the unique proteins, including apolipoprotein H/beta-2-glycoprotein 1 (Apo-H), ceruloplasmin and chromogranin B (secretogranin I) that are unique to PD, were further confirmed in the initial study using Western blotting.
In this study, apo-H and ceruloplasmin appeared to be able to segregate PD from controls and other diseases very well with a sensitivity of 67% and 56% at 95% specificity, respectively. Being an important protein for iron transportation, ceruloplasmin is of great interest because it has been implicated to play a central role in PD pathogenesis (Hochstrasser et al., 2004; Torsdottir et al., 1999). The influence of chromogranin B, a non-significant marker by itself in confirmation studies, on the overall performance of Apo-H was also remarkable, as it significantly improved the sensitivity of Apo-H to differentiate PD from controls as well as other diseases. Notably, a study performed years ago has suggested that although chromogranin B cannot differentiate AD or PD from controls by itself, the ratio between chromogranin A and B may be a correcting factor for neuropeptides seen in human CSF (Eder et al., 1998). Besides the proteins confirmed in the initial investigation, several other markers have been tested in a different and larger set of patients/subjects, in which a panel of eight CSF proteins (BDNF, IL-8, VDBP, β2-microglobulin, apoAII, apoE, plus tau and Aβ1–42) appears to be able to classify PD patients with 95% sensitivity and 95% specificity using Luminex-based MAP technology (Zhang et al., 2008).
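Figures such as "sensitivity at 95% specificity" can be understood through the following sketch, which fixes a cutoff from the control distribution and then counts the patients who exceed it. It assumes that higher marker values indicate disease and uses made-up values; it is not the statistical procedure of the cited studies, and the function and variable names are hypothetical.

```python
import math


def sensitivity_at_specificity(controls, cases, specificity=0.95):
    """Sensitivity of a marker at a cutoff giving at least the requested specificity.

    Assumes higher values indicate disease. Returns a tuple of
    (sensitivity, achieved specificity, cutoff).
    """
    if not 0.0 < specificity <= 1.0:
        raise ValueError("specificity must be in (0, 1]")
    ranked = sorted(controls)
    # Smallest number of controls that must fall at or below the cutoff;
    # the small epsilon guards against floating-point overshoot.
    n_negative = math.ceil(specificity * len(ranked) - 1e-9)
    cutoff = ranked[n_negative - 1]
    achieved = sum(1 for x in controls if x <= cutoff) / len(controls)
    sensitivity = sum(1 for x in cases if x > cutoff) / len(cases)
    return sensitivity, achieved, cutoff


# Made-up marker levels for 20 controls and 10 patients.
control_values = [float(v) for v in range(1, 21)]
patient_values = [18.0, 19.5, 21.0, 25.0, 30.0, 12.0, 22.0, 40.0, 19.0, 35.0]
sens, spec, cutoff = sensitivity_at_specificity(control_values, patient_values)
print(f"cutoff={cutoff}, specificity={spec:.2f}, sensitivity={sens:.2f}")
```

With small cohorts the achievable specificities are discrete, so the sketch reports the specificity actually attained alongside the sensitivity.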
At this point, it should be stressed that the overlap between the list of deregulated proteins in PD brain tissue and those discovered in PD CSF is exceedingly low. A few factors have likely contributed to this outcome. First, brain tissue-derived proteins may not enter CSF (even though they can get access to blood via the blood-brain barrier). Second, CNS specific proteins are low in concentration in CSF; therefore, they are easily missed during unbiased proteomics profiling. This is an issue that is being actively investigated by peptide-based targeted proteomics as described above in both human CSF and plasma.
Besides large-scale proteomic studies, single protein studies have also been developed to study known key proteins in the pathogenesis of PD as potential biomarkers. For example, an emerging body of work has shown that α-synuclein and DJ-1 can be identified in CSF and plasma (El-Agnaf et al., 2006; Maita et al., 2008; Tokuda et al., 2006; Waragai et al., 2007; Waragai et al., 2006). Reduced levels of α-synuclein in the CSF were reported to be associated with increasing severity of parkinsonism in patients with PD, and preliminary findings showed a significant increase in α-synuclein oligomers in plasma in PD patients compared with controls (El-Agnaf et al., 2006; Tokuda et al., 2006). DJ-1 was also shown to be increased in the CSF and plasma of PD patients, and the levels were correlated with disease severity (Waragai et al., 2007; Waragai et al., 2006).
After biomarkers are discovered, whether in tissues or body fluids, candidate proteins need to be validated in different cohorts of patients/controls. As discussed, multiplex immunoassays (ELISA and MAP) are clearly one of the most practical ways of testing biomarkers in a clinical setting in the near future; nevertheless, immunoassays are limited by the availability of specific antibodies. This issue is more challenging if the proteins of interest are novel, a common scenario when proteins are discovered by proteomics. In our previous unbiased protein profiling, only 30% of the markers have been validated with commercially available antibodies with respect to both protein identification and quantification (Abdi et al., 2006). We have speculated the following reasons that could account for this phenomenon. First, some antibodies are nonspecific. Second, given the caveats associated with incompleteness of current databases, one could argue that a protein confirmed with Western blot may not be the protein quantified by proteomics. Finally, some proteins may carry PTMs, e.g., glycosylation of ceruloplasmin. Because modified peptides are typically not taken into account when the relative ratios of peptides/proteins are calculated among different groups in usual quantitative profiling proteomics, it is entirely possible that the average ratio of non-modified peptide quantification for a protein shows significant relative changes but there is no detectable difference when the whole protein is assessed by Western blotting. An equally important point is that quantitative changes of modified peptides (e.g., in the case of glycosylated ceruloplasmin) may be different from that obtained from general profiling. Thus, it is beneficial to focus on unique peptides, rather than intact proteins, revealed in discovery proteomics as biomarkers.
A related issue is that endogenous peptides and small proteins, although typically missed by gel-based approaches or shotgun sequencing of trypsin-digested proteomes, have been identified in CSF and plasma (Abdi et al., 2006; Carrette et al., 2005; Jimenez et al., 2007a; Jimenez et al., 2007b; Yuan and Desiderio, 2005). To analyze these endogenous peptides, a new field called peptidomics has been developed (Schulz-Knappe et al., 2005; Svensson et al., 2007). This new technology is essentially a comprehensive proteomic analysis of the peptides and small proteins of a biological system, corresponding to the respective genomic information. Many CSF peptides are biologically active in normal and disease states (Zhang, 2007). In addition, since proteins can be cleaved aberrantly (e.g., cleavage of amyloid precursor protein by β- and γ-secretases in AD), direct de novo analysis of endogenous peptides could be invaluable for disease diagnosis. Consequently, it is entirely possible that an ideal panel of biomarkers with high sensitivity and specificity for detecting various neurodegenerative disorders and/or disease progression will be composed of unique native or digested peptides in body fluids.
PTMs of proteins, such as phosphorylation, ubiquitination and glycosylation, are among the key biological regulators of protein function, activity, localization, and interaction. Aberrant modifications are now recognized as an attribute of many mammalian diseases, including neurodegenerative diseases (Ballatore et al., 2007; Golab et al., 2004; Mechref et al., 2008), suggesting that they could serve as markers of disease activity and prognosis. More detailed discussions of the technologies currently in use, as well as the potential shortcomings and limitations of analyzing PTMs as biomarkers of neurodegenerative disease, can be found in recently published reviews (Caudle et al., 2008; Zhang, 2007). It should be emphasized, however, that specific antibodies against proteins/peptides with novel PTMs are unlikely to become commercially available in a short time; therefore, peptide-based MS or MS/MS will remain the main technology for confirming/validating candidate proteins/peptides with PTMs in the next few years.
One of the frustrating issues in biomarker discovery is that most markers discovered by one group cannot be reproduced by others. Some of this variation may be biological, as human beings are extremely heterogeneous and most of the diseases studied are very complex. To resolve this issue, multiple markers may be needed, as a combination of independent markers likely enhances the performance of the putative markers (Abdi et al., 2006; Clark et al., 2006; Simonsen et al., 2008). On the other hand, standardization of reported data is also critical to reducing inconsistency among different studies. It is almost certain that differences in the databases used have contributed to the current inconsistent (and often contradictory) results in CSF biomarker discovery (Zhang, 2007). As one of the initiatives to address this issue, we have recently summarized and standardized all of our CSF proteomics results, and have made the original MS data available to investigators in the field (Pan et al., 2007b), so that future discoveries can be compared meaningfully with the published results. To this end, HUPO (Human Proteome Organization) has been one of the leaders in organizing initiatives to standardize the generation and reporting of proteomics data.
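To illustrate why a combination of independent markers may outperform any single marker, the following Python sketch computes the expected sensitivity and specificity of a simple majority-vote panel. The per-marker values of 0.8, the assumption that the markers are conditionally independent, the majority-vote rule itself, and the helper `majority_rate` are all hypothetical choices made for illustration; none is taken from the studies cited.

```python
from itertools import product


def majority_rate(rates):
    """Probability that more than half of the independent markers give the
    correct call, given each marker's own probability of being correct."""
    n = len(rates)
    total = 0.0
    # Enumerate every pattern of correct (True) / incorrect (False) calls.
    for outcome in product([True, False], repeat=n):
        if sum(outcome) > n / 2:
            probability = 1.0
            for correct, rate in zip(outcome, rates):
                probability *= rate if correct else 1.0 - rate
            total += probability
    return total


# Hypothetical single-marker performance (assumed values, not measured data).
sensitivities = [0.8, 0.8, 0.8]
specificities = [0.8, 0.8, 0.8]

# Applying the rule to sensitivities gives the panel sensitivity;
# applying it to specificities gives the panel specificity.
panel_sensitivity = majority_rate(sensitivities)
panel_specificity = majority_rate(specificities)

print(f"panel sensitivity: {panel_sensitivity:.3f}")  # 0.896
print(f"panel specificity: {panel_specificity:.3f}")  # 0.896
```

Under these assumptions, three markers that are each correct 80% of the time yield a panel that is correct about 90% of the time. The gain shrinks as markers become correlated, which is one reason the independence of candidate markers matters.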
There is an urgent need for biomarkers to diagnose neurodegenerative diseases early in their course, to differentiate them from related diseases or subtypes, and to monitor patients' responses to new therapies. Despite the caveats of proteomics-based biomarker discovery discussed above, quite a few promising biomarkers have been described for AD and PD. With improved experimental design and sample preparation, implementation of advanced methodologies and analysis tools, and standardized reporting of results, proteomics will likely contribute significantly to managing neurodegenerative diseases in the years to come, when baby boomers become vulnerable to neurodegenerative disorders such as AD and PD.
This work was supported by the National Institutes of Health (grants ES012703, NS057567, NS060252, and AG025327) and a Shaw Endowment to J.Z.