Parkinson’s disease affects 5 million people worldwide, but the molecular mechanisms underlying its pathogenesis are still unclear. Here, we report a genome-wide meta-analysis of gene sets (groups of genes that encode the same biological pathway or process) in 410 samples from patients with symptomatic Parkinson’s and subclinical disease and healthy controls. We analyzed 6.8 million raw data points from nine genome-wide expression studies, and 185 laser-captured human dopaminergic neuron and substantia nigra transcriptomes, followed by two-stage replication on three platforms. We found 10 gene sets with previously unknown associations with Parkinson’s disease. These gene sets pinpoint defects in mitochondrial electron transport, glucose utilization, and glucose sensing and reveal that they occur early in disease pathogenesis. Genes controlling cellular bioenergetics that are expressed in response to peroxisome proliferator–activated receptor γ coactivator-1α (PGC-1α) are underexpressed in Parkinson’s disease patients. Activation of PGC-1α results in increased expression of nuclear-encoded subunits of the mitochondrial respiratory chain and blocks the dopaminergic neuron loss induced by mutant α-synuclein or the pesticide rotenone in cellular disease models. Our systems biology analysis of Parkinson’s disease identifies PGC-1α as a potential therapeutic target for early intervention.
Parkinson’s disease (PD) is a genetically complex, neurodegenerative disease with a significant but small genetic risk. In patients with PD, movement, sleep, autonomic functions, and cognition become progressively impaired. Degeneration of dopamine neurons in the substantia nigra of the brain and α-synuclein–positive Lewy bodies in brainstem and neocortex are found at autopsy, but the underlying etiology of the disease is not known.
Both environmental chemicals and genetic susceptibility are thought to contribute to the etiology of sporadic PD. Epidemiological research indicates that exposure to pesticides, including rotenone (1), and to welding elevates the risk of PD (2). The pesticide rotenone inhibits complex I [proton pumping NADH (reduced form of nicotinamide adenine dinucleotide):ubiquinone oxidoreductase] of the mitochondrial respiratory chain in dopaminergic cells (3, 4) and reproduces many of the features of PD including α-synuclein inclusions in rats (5). Intravenous injection of another complex I inhibitor, 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP), a contaminant in synthetic opiates, causes acute, permanent parkinsonism and dopamine neuron death in humans (6). By contrast, caffeine and tobacco are associated with reduced risk of PD (2).
Although most PD cases are sporadic, the discovery of genes linked to rare familial forms of disease due to mutations in the SNCA (α-synuclein), PARK2, DJ-1, PINK1 and LRRK2 genes has provided important clues about the disease process (7). Loss-of-function mutations in two genes linked to autosomal recessive PD - the nuclear-encoded mitochondrial gene PINK1 [PTEN (phosphatase and tensin homolog)-induced putative kinase-1] (8) and the E3 ubiquitin ligase PARK2 - disrupt mitochondrial function (9). Overexpression of SNCA carrying the familial PD-linked A53T mutation inhibits mitochondrial complex I in dopaminergic cells (10). In the common sporadic disease, α-synuclein and degenerating mitochondria (11) are major components of Lewy bodies—the hallmark cytoplasmic inclusions found in patient brains—and biochemical complex I deficiency is found in the substantia nigra and in platelets (7).
Massively parallel analysis of messenger RNA (mRNA) transcripts can provide an unbiased, global estimate of changes in gene expression and identify genes (12, 13) and pathways causally, reactively, or independently associated with genetic, environmental, or complex disease etiologies (13, 14). Gene expression data can be used to classify individuals according to molecular characteristics (15) and to generate hypotheses about disease mechanisms (16), and may be particularly useful for decoding complex diseases with considerable environmental and epigenetic contributions not readily explained by variations in DNA sequence. In practice, the power of genome-wide expression technology has been encumbered by discordant analyses, nonreplication, and small sample sizes typical of human studies. This problem is sharply brought into focus by studies of substantia nigra, a small region in the brainstem particularly vulnerable to PD, for which only very limited numbers of high-quality, snap-frozen, postmortem samples are globally available.
Here, we have analyzed variation in expression of multiple members of one molecular pathway (groups of genes that encode a biological process), with the power afforded by random-effects model meta-analysis of 17 studies (five previously unpublished), including analysis of nine laser-captured dopamine neuron and substantia nigra postmortem tissue investigations (Table 1) (15, 17–24). We used standardized processing of raw data from genome-wide expression studies, powerful analysis of biologically linked sets of genes, and rigorous replication. To detect functionally important, coordinated changes in gene expression, we assessed multiple members of each biological pathway. We first applied a nonparametric rank-based method, Gene Set Enrichment Analysis (GSEA) (25, 26), which combines information from the members of biological pathways to increase the signal relative to noise. GSEA is advantageous compared to widely used parametric pathway analysis methods that are based on the hypergeometric test because no arbitrary cutoffs for enrichment are introduced (25, 27).
Combining the results of multiple independent studies increases the statistical power and precision of pathway associations when scarce human brain samples prohibit individual studies of large scale. Microarrays from multiple studies are sometimes considered to be part of one big study (the “pooling participants” method). Because unequal group sizes, in the presence of a lurking, confounding bias, can weight effect estimates incorrectly, results based on this method can be flawed or even outright paradoxical (Simpson’s paradox) (28). A more objective strategy compares pathway associations with a phenotype within each genome-wide expression study (GWES) and then averages the estimates across multiple studies (29). Because GWESs typically differ vastly in sample size and in variance (a result of human biology, disease heterogeneity, and biospecimen processing), simply averaging effect estimates is not appropriate. A positive result in such a test can be due solely to bias rather than any relationship between pathway members and the phenotype of interest. It is important to weight averages to account for a data set’s sample size and noise, instead of simply averaging associations from small and large, noisy, and high-quality data sets.
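The pooling problem described above can be illustrated with a small numerical sketch. All numbers below are hypothetical and chosen only to show the effect; they are not taken from any of the studies analyzed here. In both invented studies, expression is lower in cases than in controls, yet naive pooling reverses the sign because the group sizes and baseline intensities differ between studies.

```python
# Hypothetical numbers, chosen only to illustrate the pooling problem.
# Each study: (n_cases, mean_cases, n_controls, mean_controls)
studies = {
    "study_A": (90, 20.0, 10, 21.0),  # high-baseline platform, mostly cases
    "study_B": (10, 10.0, 90, 11.0),  # low-baseline platform, mostly controls
}


def within_study_differences(studies):
    """Case-minus-control mean difference computed inside each study."""
    return {name: mc - mk for name, (_, mc, _, mk) in studies.items()}


def pooled_difference(studies):
    """Case-minus-control difference after naively pooling all samples."""
    n_cases = sum(nc for nc, _, _, _ in studies.values())
    n_ctrl = sum(nk for _, _, nk, _ in studies.values())
    mean_cases = sum(nc * mc for nc, mc, _, _ in studies.values()) / n_cases
    mean_ctrl = sum(nk * mk for _, _, nk, mk in studies.values()) / n_ctrl
    return mean_cases - mean_ctrl


per_study = within_study_differences(studies)  # negative in both studies
pooled = pooled_difference(studies)            # positive after pooling
print(per_study, pooled)
```

Comparing cases and controls within each study first, and only then combining the per-study estimates, avoids this reversal.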
We overcame these challenges using random-effects model meta-analysis (termed meta-GSEA) and probed 522 gene sets for associations with PD across a total of 17 GWESs from 322 human brain and 88 blood samples in a three-stage design (nine, one, and seven GWESs included in stages 1, 2, and 3, respectively). This meta-analysis method uses weighted effect estimates that account for each data set’s sample size and noise in order to estimate a summary effect for each gene set’s association with PD.
We uniformly processed individual-level raw data (.CEL files) and annotated probes to 522 curated biological processes, pathways, and gene sets (henceforth gene sets) predefined in the Broad Institute’s Molecular Signatures Database (MSigDB) C2 v1.1 (25). MSigDB is a compilation of curated sets of genes that share common biological function or regulation based on evidence from pathway encyclopedias such as KEGG, BioCarta, and GenMAPP; gene expression signatures systematically extracted from PubMed publications; and knowledge of domain experts. Using these gene sets and gene expression intensities, we tested for association of each of the 522 gene sets with PD in each of nine GWESs using GSEA (25), checked for study outliers, and then combined these results in a random-effects model meta-analysis across a total of 185 microarrays (99 from cases and 86 from controls) (stage 1 analysis; Table 1). One outlier data set was observed (fig. S1) but was conservatively included in the primary meta-analysis according to our intention to exclude data sets solely based on study entry criteria [removal of the outlier data set further increased the strength of the observed associations (table S1)]. To control for differences in dopamine neuron numbers in patients with PD and control individuals, we derived three of the nine GWESs from dopamine neurons laser-captured from the substantia nigra of patients and control individuals. One of these GWESs is first presented here (Middleton-1) and one was unpublished at the time of analysis (NBD). In GSEA, for a given pairwise comparison, genes are ranked according to the difference in expression, and a nonparametric Kolmogorov-Smirnov statistic is applied to determine whether the rank ordering is random or associated with the disease phenotype (25). Association of a gene set with the disease phenotype is quantified by normalized enrichment scores (NESs), with the sign of the NES indicating positive (overexpression) or negative (underexpression) enrichment of a gene set (25).
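The rank-based idea behind the enrichment score can be sketched as follows. This is a simplified, unweighted Kolmogorov-Smirnov-style running sum, not the GSEA software itself: the published method (25) uses a weighted statistic and normalizes the score against permutations to obtain the NES, and both steps are omitted here. The gene names and the function name `enrichment_score` are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted Kolmogorov-Smirnov-style running-sum statistic.

    ranked_genes: genes ordered from most over- to most underexpressed in cases.
    gene_set: collection of genes belonging to the pathway.
    Returns the signed running-sum value with the largest absolute deviation
    from zero (negative when set members cluster at the bottom of the ranking).
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        # Step up on a set member, step down otherwise.
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking of ten genes, g1 (most overexpressed) to g10.
ranked = [f"g{i}" for i in range(1, 11)]
es_down = enrichment_score(ranked, {"g8", "g9", "g10"})  # members at the bottom
es_up = enrichment_score(ranked, {"g1", "g2", "g3"})     # members at the top
print(es_down, es_up)
```

A set whose members cluster at the bottom of the ranking yields a negative score, which corresponds to the underexpression sign convention used for the NES.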
We then introduced a random-effects model meta-analysis to combine effect estimates for each of the 522 gene sets across nine laser-captured microdissected dopamine neuron (DA) and substantia nigra studies, termed meta-GSEA. This meta-analysis uses weighted averages of a NES to account for a data set’s sample size and noise in order to estimate a summary NES (sNES). sNESs and 95% confidence intervals were calculated on the basis of a random-effects model, which uses weights that incorporate both within-study and between-study variance (30). The genome-wide significance threshold was conservatively specified as P < 9.6 × 10−5 (0.05 divided by 522, the number of gene sets tested). This Bonferroni correction is likely overly restrictive, because several gene sets are partially overlapping and therefore not truly independent tests. Twenty-eight gene sets with P values of <9.6 × 10−5 (range, <10−8 to 0.00008) met our significance threshold (Fig. 1A and table S2). Key pathways were enriched across GWESs from substantia nigra homogenates (Zhang, Papapetropoulos, Moran, Miller, Hauser, and Grünblatt in Figs. 2 and 3, A and D) and GWESs derived from dopamine neurons laser-captured from substantia nigra [DA; data sets NBD and Middleton-1 in Figs. 2, 3, A and D, and 4A; the third DA data set (Cantuti) is a technical outlier (see fig. S1)]. Because we examined individual neurons in the DA data sets, these results cannot be explained by differences in proportions of dopamine neurons or glia assayed in the tissue.
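A random-effects summary of this kind can be sketched as below. The text does not specify which between-study variance estimator was used, so the widely used DerSimonian-Laird estimator is assumed here; how a variance is attached to each per-study NES is also an assumption, and the input values are hypothetical. Only the Bonferroni threshold (0.05 divided by 522) comes from the text.

```python
import math


def random_effects_summary(effects, variances):
    """DerSimonian-Laird random-effects summary of per-study effect estimates.

    Returns (summary, 95% confidence interval, two-sided P value, tau2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse within-study variance
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    # Random-effects weights add the between-study variance to each study.
    w_star = [1.0 / (v + tau2) for v in variances]
    summary = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    z = summary / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return summary, (summary - 1.96 * se, summary + 1.96 * se), p, tau2


BONFERRONI_THRESHOLD = 0.05 / 522  # about 9.6e-5, as stated in the text

# Hypothetical per-study NES values and variances for one gene set.
nes = [-1.6, -1.4, -1.7, -1.5, -0.9, -1.8, -1.3, -1.6, -1.5]
var = [0.10, 0.15, 0.08, 0.12, 0.30, 0.09, 0.20, 0.11, 0.14]
snes, ci, p_value, tau2 = random_effects_summary(nes, var)
significant = p_value < BONFERRONI_THRESHOLD
print(snes, ci, p_value, significant)
```

Noisy studies (large variance) receive small weights, and when studies disagree more than their within-study noise would predict, the between-study term widens the confidence interval.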
We next examined the stability of these results (fig. S2). We iteratively left one study out at a time, performed meta-GSEA for the remaining eight studies, and scored the number of times 1 of the 28 gene sets achieved genome-wide significance in the left-in studies. Stability was 100% (P < 9.6 × 10−5 in each of nine unique iterations) for 19 gene sets. Thus, the meta-analysis results for 19 of 28 gene sets were highly stable and unaffected by removal of an individual study from the meta-analysis, whereas the results for the remaining gene sets were less stable.
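The leave-one-out procedure can be sketched as follows. For brevity the summary P value here comes from a simple inverse-variance weighted mean, used as a stand-in for the random-effects model described above; the function names and the input values are hypothetical.

```python
import math


def summary_p(effects, variances):
    """Two-sided P value of an inverse-variance weighted mean (stand-in model)."""
    w = [1.0 / v for v in variances]
    est = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    z = est * math.sqrt(sum(w))  # estimate divided by its standard error
    return math.erfc(abs(z) / math.sqrt(2.0))


def leave_one_out_stability(effects, variances, threshold, p_func=summary_p):
    """Fraction of leave-one-out iterations whose P value stays below threshold."""
    k = len(effects)
    passes = 0
    for left_out in range(k):
        kept_e = [e for i, e in enumerate(effects) if i != left_out]
        kept_v = [v for i, v in enumerate(variances) if i != left_out]
        if p_func(kept_e, kept_v) < threshold:
            passes += 1
    return passes / k


threshold = 0.05 / 522

# Hypothetical gene set supported by every study: stable.
stable = leave_one_out_stability([-1.5] * 9, [0.1] * 9, threshold)

# Hypothetical gene set driven by a single study: unstable.
driven = leave_one_out_stability([-5.0, 0.0, 0.0], [0.1, 1.0, 1.0], threshold)
print(stable, driven)
```

A stability of 1.0 corresponds to the 100% figure reported for 19 gene sets, meaning that significance survived the removal of each study in turn.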
We selected gene sets for validation conservatively on the basis of the statistical evidence for association in stage 1. Unstable gene sets were included according to our intention to forward gene sets based on the genome-wide level of evidence. To determine whether the 28 gene sets identified in stage 1 are already perturbed in incipient PD (31–36), we interrogated pathway changes in postmortem substantia nigra of 16 cases with subclinical, PD-related, α-synuclein–positive, incidental Lewy body disease (PD-LBN), and in 17 age-, sex-, and postmortem interval (PMI)–matched controls without PD-related lesions on autopsy and also free of neurologic disease (Table 1 and table S3).
The presence of α-synuclein immunoreactive inclusions in neuronal cytoplasm (Lewy bodies) is mandatory for the neuropathologic confirmation of the clinical diagnosis of classical PD (35). In the clinically recognizable motor phase of sporadic PD, most patients display bradykinesia and resting tremor. The preclinical phase of PD has been estimated to precede motor impairment by >5 years based on nigral neuropathology, striatal dopamine transporter imaging, and nonmotor clinical symptoms (37). It is expected for any slowly progressive, aging-dependent disease that not all individuals live to cross the threshold from a subclinical phase to the clinically symptomatic phase (36). Indeed, an estimated 5 to 10% of individuals without motor symptoms during life display typical α-synuclein–positive Lewy bodies and neurites (36, 38), a reduction of dopamine markers (32–34), and mild and select loss of ventrolateral substantia nigra neurons on autopsy (35, 39). This has led to the concept that brainstem α-synucleinopathy represents probable early, subclinical PD (32, 33, 35, 36, 39–41), although this is not without controversy (40, 42). In individuals with incidental Lewy body disease, the distribution of Lewy body inclusions is similar to that in PD (34, 36, 41, 43). The mild loss of substantia nigra neurons found in these individuals follows the regional pattern of cell loss seen in PD (predominant loss of the ventrolateral tier of the substantia nigra), but not in normal aging (predominant loss of the dorsolateral tier, with the ventrolateral tier being unaffected) (35, 39). This mild loss is associated with decreased striatal tyrosine hydroxylase (TH) (32–34) and vesicular monoamine transporter 2 immunoreactivity (33) that is intermediate to that seen in patients with late-stage PD and in healthy controls. As in overt PD, biochemical markers of oxidative stress are also increased in persons with incidental Lewy body disease (44, 45). Importantly, these individuals have not been treated with dopamine replacement therapy, thereby excluding PD medications as a confounding factor in the gene set analysis.
Twelve of the 28 gene sets were significantly associated with subclinical, mild, PD-related Lewy body neuropathology (36, 41), with P values of <0.05 (Table 2 and Fig. 3, B and E), suggesting that changes in these pathways reflect early pathobiological processes that may be prime targets for disease-modifying therapeutics.
In PD, α-synuclein immunoreactive Lewy bodies and Lewy neurites are present in a diversity of neurons outside the substantia nigra (36), and molecular or biochemical changes can be found in peripheral neuronal and even nonneuronal cells of patients with PD (46, 47) [and references in (13)]. Together with work in yeast and animal model systems (48), these data may indicate that dopamine neuron degeneration in PD is the result of general cellular defects to which dopamine neurons are simply more sensitive than other cells. To evaluate this hypothesis, we took the 12 gene sets forward for further evaluation in a total of 192 nonnigral samples (106 from cases, 86 from controls without neurodegenerative disease) from seven additional GWESs (Table 1). These include brain regions that show abundant Lewy body pathology without neuron loss in PD, such as frontal cortex (FC) and prefrontal cortex Brodmann area 9 (BA9), as well as basal ganglia structures that are affected by biochemical changes in PD such as globus pallidus (GP) and putamen (PU). Additionally, one data set from human lymphoblastoid (LB) cell lines and one from whole cellular blood were included, because defects in mitochondrial function (46) and dopamine signaling (47) have been reported in peripheral blood cells in early untreated PD. These data sets represent, to our knowledge, all nonnigral GWESs in sporadic PD available at the time of analysis. Ten of the 12 gene sets had a P value of <0.05 in the raw data level meta-analysis of stage 3 with association in the same direction as the original signal (Table 2).
Overall, 10 of 12 gene sets analyzed in all stages showed a significant association in the same direction at each stage (Table 2). Ten gene sets reached a compelling degree of significance for association with PD in all three stages (P values better than 9.6 × 10−5 in stage 1, and better than 0.05 in stages 2 and 3), with P values below 10−5 in both the combined meta-analyses of substantia nigra stages 1 + 2 and stages 1 to 3 (Table 2).
The strongest statistical evidence for an association signal was for the electron transport chain (ETC) gene set (Table 2), one of a cluster of overlapping gene sets showing association in the stage 1 meta-analysis (−1.583, <1 × 10−8; Figs. 2 and 3A) and across stage 2 (−1.496, 1.46 × 10−2; Fig. 3B) and stage 3 samples (−1.42, 1.66 × 10−5; Fig. 3C). The ETC group contains 95 genes that include all subunits of the mitochondrial respiratory chain complexes I to V encoded by the nuclear human genome (13 subunits are encoded by the mitochondrial genome). These were coordinately underexpressed in PD in laser-captured dopamine neurons (Figs. 3A and 4A), in subclinical PD-related Lewy body pathology (Fig. 4, C and D), and in the meta-analysis of nonnigral neuronal and blood cells (Fig. 3C). Overall, the estimate of effect was based on 410 microarrays (221 from cases and 189 from controls). In the NBD laser-captured dopamine neuron data set, the deficit in ETC gene set expression reached P = 0.002 (Fig. 3A), suggesting that the deficit in ETC expression in PD can be detected specifically in dopamine neurons of the substantia nigra. Underscoring the robustness of the association between PD and the ETC pathway, three other gene sets that in part contain constituents of the ETC were also significantly associated with PD (Fig. 1B; termed MAP00190 oxidative phosphorylation, VOXPHOS, and GO 0005739, respectively). These results (Table 2) indicate that our screen captured the pervasive changes in nuclear-encoded ETC to saturation and underscore the robustness of this observation.
The second strongest signal for a unique gene set was for the pyruvate metabolism gene set. The pyruvate metabolism gene set (Table 2) was underexpressed in stage 1 (−1.529, P = 3.36 × 10−8) and stage 2 (−1.844, P = 2.37 × 10−2), and to a lesser degree in nonnigral stage 3 (−1.062, P = 4.59 × 10−3). This association was also revealed by the biochemically linked and overlapping Krebs-TCA gene set (Table 2 and Fig. 1B). Together, the two gene sets encode the molecular machinery controlling entry of pyruvate, the intermediary resulting from glycolysis, into the Krebs cycle. Consistent with these molecular results in early-stage disease and with the abnormal glucose utilization seen in living patients (49, 50), the carbohydrate-responsive element–binding protein (ChREBP) pathway was also significantly underexpressed across all three stages of analysis (Table 2). The genes in the ChREBP pathway are distinct from the other gene sets identified (no gene overlap; Fig. 1B). ChREBP is a glucose-sensing transcription factor that transactivates key genes of glucose metabolism.
The transcriptional coactivator PPARGC1A (PGC-1α) is a master regulator of mitochondrial biogenesis and oxidative metabolism (51). In PGC-1α knockout mice, expression of genes responsible for mitochondrial respiration is markedly blunted, and mitochondrial enzymatic activities and concentrations of adenosine triphosphate (ATP) are decreased (52). Underexpression of a gene set of 425 PGC-1α–responsive genes (PGC) was significantly associated with PD pathology in the stage 1 meta-analysis (−1.366; P = 6.75 × 10−6; Fig. 3D), as well as with subclinical, mild, PD-related Lewy body neuropathology (36, 41) in stage 2 (−1.576; P = 0.0496; Fig. 3E), and across stage 3 samples (−0.884; P = 0.0165; Fig. 3F) (Table 2). Again, underexpression of the PGC-1α–responsive genes was clearly detectable in GWES of laser-captured dopamine neurons from PD cases compared with controls (Figs. 3D and 4A) and thus was not due to simple differences in proportions of dopamine neurons or glia in the specimens. PGC-1α–responsive genes were also underexpressed in substantia nigra of subclinical PD-related Lewy body neuropathology (36, 41) (Fig. 4, C and D). Thirty-one ETC genes are PGC-1α–responsive and thus are annotated both to the 95-gene ETC and the 425-gene PGC pathway. Even after omitting all 31 PGC-1α–responsive ETC genes from the PGC gene set, PGC was still robustly associated with PD (NES = −1.27, P = 6.59 × 10−5; −1.60, P = 0.05; and −0.8, P = 0.036, respectively, for the three stages). This indicates that PGC-1α–responsive genes other than those controlling ETC that are involved in mitochondrial protein import, mitochondrial protein folding, and mitochondrial translation were contributing to these association signals. Consistently, two additional gene sets that encode mitochondrial biogenesis genes, termed “mitochondr” and “human mitoDB 2002” (Fig. 1B; 108 of 447 and 111 of 428 genes, respectively, overlap with the PGC gene set), were associated with PD in all three analysis stages (Table 2). Collectively, this evidence indicates that PGC-1α–responsive genes are perturbed in PD and implicates PGC-1α–controlled ETC and PGC-1α–controlled mitochondrial bioenergetics defects in early and advanced molecular pathology in the substantia nigra, as well as in nonnigral tissues.
To confirm the underexpression of ETC and PGC-1α–responsive genes on a third gene expression platform in addition to the Illumina HumanHT-12v3 bead arrays (stage 2) and Affymetrix solid-phase arrays used in stages 1 and 3, we performed quantitative real-time polymerase chain reaction (qPCR) based on precise 5′ nuclease chemistry for 19 nuclear-encoded ETC genes selected on the basis of the microarray results (including 10 PGC-1α–responsive genes). Underexpression of these subsets of ETC and PGC-1α–responsive ETC genes was independently confirmed by qPCR in substantia nigra samples of a new population of 13 patients with PD and 17 age-, sex-, and PMI-matched controls without neurodegenerative disease, with P = 3.8 × 10−6 and P = 0.002, respectively, by a binomial test (Fig. 4B). Next, underexpression of the ETC and PGC-1α–responsive ETC genes was replicated by qPCR in 15 individuals with incidental Lewy body pathology and 17 age-, sex-, and PMI-matched control subjects used in stage 2, with P = 3.8 × 10−6 and P = 0.002, respectively, by a binomial test (Fig. 4D). These two qPCR studies confirmed the pathway changes measured by GWESs.
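The binomial test mentioned above can be illustrated with an exact sign test against a 50:50 null. The text does not state how many of the assayed genes were underexpressed or whether the test was one- or two-sided. The sketch below assumes, for illustration only, a two-sided test in which all 19 ETC genes (and all 10 PGC-1α–responsive genes) were lower in cases; under that assumption the results are close to the reported P values.

```python
from math import comb


def two_sided_sign_test(n_under, n_genes):
    """Exact two-sided binomial (sign) test against a 50:50 null.

    n_under: number of genes with lower expression in cases (assumed input).
    n_genes: total number of genes assayed.
    """
    k = max(n_under, n_genes - n_under)
    tail = sum(comb(n_genes, i) for i in range(k, n_genes + 1)) / 2 ** n_genes
    return min(1.0, 2.0 * tail)


# Assumption for illustration: every assayed gene was underexpressed.
p_etc = two_sided_sign_test(19, 19)  # equals 2 * 0.5**19
p_pgc = two_sided_sign_test(10, 10)  # equals 2 * 0.5**10
print(p_etc, p_pgc)
```

Under this null, each gene is equally likely to be higher or lower in cases, so a consistent direction across many genes is unlikely to arise by chance even when individual fold changes are small.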
PGC-1α regulates a fundamental transcriptional cycle that modulates mitochondrial function and provides homeostatic control of cellular ATP (53). We investigated whether overexpression of PGC-1α coactivates transcription of nuclear-encoded electron transport genes and genetically blocks α-synuclein toxicity in a well-established cellular model of α-synucleinopathy (54–57). The model consists of primary cultures prepared from the midbrain region of E17 (embryonic day 17) rats. Transduction of the cells with adenovirus encoding α-synuclein that carries a human PD-linked mutation (A53T) results in a general loss of neurons, as manifested by a decrease in the number of cells that stain positive for the general neuronal marker MAP2 (56, 57). Dopaminergic (TH-positive) neurons are more vulnerable to the toxic effects of A53T overexpression than other neurons (for example, GABAergic neurons) in the cultures (55, 56). Therefore, with the primary midbrain cell culture model, one can monitor the preferential toxicity of α-synuclein to dopaminergic neurons relative to other neurons (and this preferential toxicity is thought to be relevant to PD pathogenesis). We found that cotransduction with adenovirus carrying human PGC-1α (but not the control LacZ gene) activated the expression of endogenous genes encoding nuclear subunits of the mitochondrial respiratory chain complexes I, II, IV, and V in neurons overexpressing A53T–α-synuclein (Fig. 5A). Moreover, the loss of TH-positive neurons induced by A53T–α-synuclein was rescued by cotransduction with adenovirus encoding human PGC-1α (P < 0.01; Fig. 5B). Furthermore, PGC-1α overexpression abrogated the A53T–α-synuclein–mediated retraction of MAP2- and TH-positive neuronal processes (Fig. 5C). These results indicate that expression of PGC-1α can up-regulate nuclear-encoded subunits of the mitochondrial respiratory chain and alleviate α-synuclein neurotoxicity in the primary midbrain culture model.
PD is a complex disease with strong environmental risk factors including exposure to pesticides (2) such as rotenone (1). Rotenone inhibits the transfer of electrons from iron-sulfur centers in complex I of the ETC to ubiquinone. Chronic systemic complex I inhibition caused by rotenone exposure induces features of PD in rats, including nigrostriatal dopaminergic degeneration and formation of α-synuclein–positive inclusions (5). The underlying mechanisms of rotenone-induced neuronal toxicity can be evaluated in in vitro models based on treating catecholaminergic neuronal cells with rotenone (3, 4). We investigated whether overexpression of PGC-1α can ameliorate rotenone toxicity in rat primary dopamine neurons and in human catecholaminergic SH-SY5Y neuronal cells, a standard cellular system for modeling the biology of human substantia nigra dopamine neurons (16). The preferential loss of TH-positive neurons induced by exposure to rotenone was rescued by cotransduction with adenovirus encoding human PGC-1α (P < 0.01; Fig. 6A) in primary mesencephalic cultures. Overexpression of PGC-1α compared to the control gene LacZ coactivated the expression of nuclear-encoded subunits of complex I, II, III, IV, and V of the mitochondrial ETC (Fig. 6B) and induced a small but statistically significant increase in viability of human catecholaminergic SH-SY5Y cells exposed to rotenone (14% increase; P = 0.02; Fig. 6C). The MTT assay, which measures the reduction of 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide into formazan by cellular and mitochondrial dehydrogenases (58), was used to estimate cell viability.
Another neurotoxin, 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP), causes selective death of dopaminergic neurons in the substantia nigra pars compacta (SNpc) and parkinsonism in humans, and is widely used to model PD in rodents and primates (6, 59). Consistent with our human and in vitro data, St-Pierre and colleagues provided evidence in mice that genetic ablation of the PGC-1α gene markedly enhances MPTP-induced loss of TH-positive neurons in the substantia nigra (60). MPTP exposure caused a 61% loss of TH-positive neurons in the substantia nigra of PGC-1α knockout mice but only a 12% loss in wild-type mice (60).
By integrating a large-scale gene expression database and genome-wide pathway analysis with validation in subclinical disease and mechanistic analysis in primary dopaminergic neurons, we found that nuclear-encoded PGC-1α–responsive bioenergetics and ETC genes are coordinately underexpressed in human PD and in incipient Lewy body disease, and that PGC-1α coactivates these genes and blocks A53T–α-synuclein– and rotenone-induced dopamine neuron loss in cellular disease models.
Here, we have conducted a comprehensive assessment of currently available data on gene expression in PD, integrating 17 studies and 14 million data points of 322 human brain and 88 blood samples. To identify molecular pathways associated with PD, we combined genome-wide expression analysis of human laser-captured dopamine neurons and substantia nigra postmortem tissue from PD patients, large-scale pathway meta-analysis, and validation on multiple platforms, as well as evaluation of individuals likely to be in the earliest stages of Lewy body disease. Our study found that decreases in expression of 10 gene sets are associated with PD, even in probable subclinical disease and in tissues outside the substantia nigra. These 10 gene sets encode proteins responsible for four distinct, but interconnected, cellular processes: nuclear-encoded mitochondrial electron transport, mitochondrial biogenesis, glucose utilization, and glucose sensing. We show that bioenergetics genes responsive to the master regulator PGC-1α, including nuclear-encoded ETC genes, are underexpressed in patients with PD and in incipient Lewy body disease. In addition, coactivation by PGC-1α up-regulates nuclear subunits of mitochondrial respiratory chain complexes I, II, III, IV, and V and blocks dopamine neuron loss in cellular models of PD-linked α-synucleinopathy and rotenone toxicity. Consistent with a mechanistic role in the onset of the human disease proposed by our observations, genetic ablation of PGC-1α in mice markedly enhances MPTP-induced dopamine neuron loss in the substantia nigra (60). This neurotoxin is widely used to model PD in primates and rodents (61) and caused a 61% loss of TH-positive neurons in the substantia nigra of PGC-1α knockout mice but only a 12% loss in wild-type mice (60).
Physiology, pharmacology, and clinical and genetic investigation support the relationship between PD and the two main components required for energy metabolism in neurons: electron transport carriers and glucose utilization. For example, the ability of complex I to transfer electrons from NADH to ubiquinone (coenzyme Q) was impaired in mitochondria purified from the FC of five individuals with PD (62), and complex III and IV activity was also reduced, although to a lower extent (62). Complex I subunits in FC specimens from 10 patients with PD exhibited increased protein carbonyl content, indicating oxidative damage, and this inversely correlated to rates of NADH-driven electron flow in the tissue (63). ETC dysfunction is also found in platelets from patients with PD, defective oxidative phosphorylation may occur in muscle, and the oxidative stress marker 8-hydroxydeoxyguanosine is elevated in plasma of patients with PD (7). Complex I inhibitors such as MPTP and rotenone cause dopamine cell death and parkinsonism (7). Nevertheless, the precise contribution of complex I dysfunction in the etiology of common, sporadic PD has remained unclear (64). Genetically, mutations in the mitochondrially targeted serine-threonine kinase PTEN-induced putative kinase-1 (PINK1) and the E3 ubiquitin ligase Parkin (PARK2) cause autosomal recessive PD in rare families. Mutations in PARK2 and PINK1 lead to mitochondrial swelling in fruit flies (8, 9) and involve a pathway that links ubiquitylation with selective autophagy of damaged mitochondria (65). A relationship between glucose utilization and PD is supported in living patients. Magnetic resonance spectroscopy (49) and 2-[18F]fluoro-2-deoxy-D-glucose PET studies (50) demonstrate increased lactate concentrations and glucose hypometabolism, consistent with a general shift to anaerobic glycolysis in the neocortex of PD patients.
On the basis of our data presented here, we hypothesize that PD is characterized by pervasive, coordinated, nuclear-encoded cellular energetic defects to which nigral dopamine neurons are intrinsically more susceptible than other cells. Complex I deficiency in PD may be a biochemically detectable “tip of the iceberg” of a deeper molecular defect comprising the entire nuclear-encoded ETC. Underexpression of PGC-1α–controlled genes involved in cellular energetics might represent a common link for these diverse manifestations of defects in mitochondrial biogenesis and energetics, and abnormal glucose utilization. If this hypothesis is valid, it would suggest that modulation of cellular energetics could be used to prevent or treat PD, and that monitoring cellular energetics could serve as a diagnostic tool.
The results of this study should be interpreted bearing in mind its limitations. First, although every effort was made to ascertain all appropriate publications, it is possible that some were missed. Second, undetected publication bias, which could arise from the underreporting of studies that show a negative finding, may confound the results of a meta-analysis. However, several provisions in this study make this unlikely. We identified and included five unpublished GWES data sets. We were the first to analyze pathway enrichment by using GSEA in each of the studies (in other words, all individual GSEA results are new and thus did not directly influence publication decisions). In addition, we examined how leaving out the first published nigral GWES study, or any other nigral GWES study, changed the effect estimates (no effect was found on replicated gene sets; fig. S2). Third, the number and quality of annotated gene sets are evolving. The gene sets evaluated in this study likely represent only a subset of all true biological pathways or processes and are subject to updates as biomedical knowledge increases. Because pathway encyclopedias are systematically expanded, this PD gene expression database will be a useful resource for testing new gene sets for association with human PD. Fourth, the staged approach used is designed to yield true-positive results by extensively validating candidate pathways identified in stage 1. It is, however, prone to false negatives because only gene sets robustly associated with late-stage disease are forwarded for replication in subjects with incipient pathology. This tiered approach cannot detect gene sets exclusively associated with initial, but not late-stage, pathology. Larger sample sizes of high-quality incidental Lewy body cases will be needed to systematically identify all molecular pathways associated with early pathology.
Fifth, expression analysis of postmortem substantia nigra alone cannot distinguish between associations reflecting the molecular pathobiology of PD versus proliferation of glia or depletion of dopamine neurons in patient tissue. To correct for this bias, we used laser capture of dopamine neurons. In an analysis of individual dopamine neurons, both ETC and PGC pathways were strongly underexpressed. In our study, pathway associations were further evaluated in subclinical PD-related Lewy body disease (36, 41), in which there is only mild loss of dopamine neurons (35, 39), as well as in brain regions and blood cells that show histopathological or biochemical changes in PD, but not cell loss. Collectively then, the analysis of laser-captured dopamine neuron data sets, the recapitulation in early disease, and the replication in tissues not subject to neuron loss in PD indicate that the pathways here associated with PD are not materially confounded by glial proliferation or depletion of dopamine neurons.
Methods like the meta-GSEA approach we have used here will be useful for identifying disease-linked pathway signals in other diseases by integrating measurements of multiple genes and multiple genome-wide expression studies. This may be necessary when the molecular processes leading to common complex diseases result from modest variation in the expression of multiple members of a pathway, when both environmental and genetic contributions are integrated in the pathway signature, or when access to biospecimens is limited.
The coordinating principal investigator of a Michael J. Fox Foundation genetics consortium award invited the corresponding authors of genome-wide expression studies in PD published in the English language and representing eight or more arrays to serve as collaborating investigators of the Global PD Gene Expression Consortium and to contribute unprocessed raw data (for example, CEL files) underlying their published work as well as unpublished data. These publications were identified via PubMed and Gene Expression Omnibus (GEO) searches by using the terms “Parkinson’s disease” and “gene expression” or “transcriptional profiling,” as of January 31, 2008. The collaborating investigators were asked to contribute clinical and raw genome-wide expression data to the bioinformatics core. In total, 17 microarray data sets (Table 1) were analyzed: 11 data sets were identified through PubMed (15, 17–23), 1 through the National Brain Databank, 4 new data sets were contributed by co-investigators, and 1 was generated for the stage 2 replication study. Eleven of 17 studies were previously published (15, 17–23), 5 are newly published in the current study, and 1 was unpublished at the start of our analysis but has since been published elsewhere (24). Affymetrix .CEL files from all data sets were normalized to “all probe sets” in a standardized manner and scaled to 100 by the MAS5 algorithm implemented in the Bioconductor package (66).
GSEA is a nonparametric method (25, 26) to determine whether a pre-defined gene set is enriched at the top or bottom of a list of all genes assayed rank-ordered by their association with the phenotype (using an appropriate metric, generally signal-to-noise ratio). GSEA was performed for each of the 17 data sets as described (25, 26) (see also Supplementary Material). For each gene set, an enrichment score was calculated, which is a normalized Kolmogorov-Smirnov statistic. To adjust for different sizes of gene sets, a positive (negative) normalized enrichment score (NES) was computed (25, 26).
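To make the rank-based statistic concrete, the following is a minimal sketch of an unweighted, Kolmogorov-Smirnov-like running-sum enrichment score. It is a simplification and not the published GSEA implementation: the published method weights hits by the ranking metric and normalizes the score against permutation-derived scores, and neither step is shown here. The function name and gene labels are hypothetical, and the ranked list is assumed to contain each gene once.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-like running-sum statistic (illustrative sketch only).

    ranked_genes: unique gene names ordered by association with the phenotype,
                  from most up-regulated to most down-regulated.
    gene_set:     collection of gene names defining the pathway.
    Returns the running-sum value with the largest absolute deviation from zero;
    positive means enrichment at the top of the list, negative at the bottom.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    hit_step = 1.0 / n_hit    # step up when a gene belongs to the set
    miss_step = 1.0 / n_miss  # step down when it does not

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += hit_step if gene in members else -miss_step
        if abs(running) > abs(best):
            best = running
    return best
```

In this sketch a gene set concentrated among the most underexpressed genes yields a score near -1, which corresponds to the direction reported here for the electron transport gene sets.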
Meta-analysis provides a means to quantitatively combine results from several studies on a specific topic. We used the NES as an estimate of effect size and combined these estimates across studies by adapting a random-effects model meta-analysis statistic that accounts for both within-study variance and between-study variance (30). Results of a fixed-effect meta-analysis strictly apply to the studies involved in the given meta-analysis, whereas random-effects model estimates such as we used here are intended to make inferences to a superpopulation of studies from which those actually analyzed were randomly or at least representatively selected.
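The random-effects model summary effect-size estimate of a gene set ($\hat{\mu}$) - the sNES - is calculated as follows:

$$\hat{\mu} = \frac{\sum_{i=1}^{k} w_i^{*} T_i}{\sum_{i=1}^{k} w_i^{*}}$$

where $T_i$ is the NES in the $i$th data set and $w_i^{*}$ is an estimated optimal weight that is the reciprocal of the variance and given by:

$$w_i^{*} = \frac{1}{v_i + \hat{\tau}^2}, \qquad \hat{\tau}^2 = \max\left(0,\; \frac{Q - (k - 1)}{\sum_{i=1}^{k} w_i - \sum_{i=1}^{k} w_i^2 \big/ \sum_{i=1}^{k} w_i}\right), \qquad Q = \sum_{i=1}^{k} w_i \left(T_i - \bar{T}_{\cdot}\right)^2$$

where $\bar{T}_{\cdot}$ is the weighted mean of NESs for one gene set, $w_i$ is the reciprocal of $v_i$ (the within-study variance of $T_i$), and $k$ is the total number of studies.

The variance of the random-effects estimate is given by the reciprocal of the sum of the random-effects weights:

$$v^{*} = \frac{1}{\sum_{i=1}^{k} w_i^{*}}$$

The 95% confidence interval (CI) for the mean effect $\mu$ is calculated by:

$$\hat{\mu} \pm z_{\alpha/2} \sqrt{v^{*}}$$

where $z_{\alpha/2}$ is the two-tailed critical value of the standard normal distribution; for $\alpha = 0.05$ and 95% confidence intervals, $z_{\alpha/2} = 1.96$.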
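The summary statistic described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: it assumes that the between-study variance is estimated with the standard DerSimonian-Laird moment estimator, and the function and variable names are hypothetical.

```python
import math


def random_effects_summary(nes, variances, z=1.96):
    """Combine per-study NES values with random-effects weights (sketch).

    nes:       list of NES values, one per data set (T_i).
    variances: list of within-study variances of each NES (v_i).
    z:         critical value of the standard normal (1.96 for a 95% CI).
    Returns (summary_nes, variance, (ci_low, ci_high), tau2).
    """
    k = len(nes)
    if k != len(variances) or k < 2:
        raise ValueError("need matching NES and variance lists from >= 2 studies")

    # Fixed-effect weights and weighted mean, used only to estimate tau^2.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    t_bar = sum(wi * ti for wi, ti in zip(w, nes)) / sum_w

    # Heterogeneity statistic Q and moment estimate of between-study variance
    # (DerSimonian-Laird form, assumed here), truncated at zero.
    q = sum(wi * (ti - t_bar) ** 2 for wi, ti in zip(w, nes))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights, summary estimate, its variance, and the CI.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    mu = sum(wi * ti for wi, ti in zip(w_star, nes)) / sum_w_star
    var_mu = 1.0 / sum_w_star
    half_width = z * math.sqrt(var_mu)
    return mu, var_mu, (mu - half_width, mu + half_width), tau2
```

When the studies agree closely, the estimated between-study variance is truncated to zero and the result coincides with a fixed-effect estimate. When they disagree, the weights become more nearly equal and the confidence interval widens.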
For each data set, we bootstrapped case and control samples separately and recomputed the NES of a gene set for each iteration. This procedure was repeated 5000 times. Because the distribution of the 5000 bootstrapped samples was bimodal, we randomly selected 1000 positive (or negative) NES from the 5000 bootstrap samples if the observed NES was positive (or negative) (25). The variance of a gene set was estimated as the observed variance of the bootstrap-generated frequency distribution of the 1000 NESs.
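The bootstrap procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `nes_fn` is a hypothetical callback standing in for a full GSEA run on the resampled samples, `mean_difference` is a toy statistic included only so the sketch runs end to end, and the sample variance (n - 1 denominator) is used for the retained values.

```python
import random
import statistics


def mean_difference(cases, controls):
    """Toy stand-in for a real NES computation (hypothetical)."""
    return statistics.mean(cases) - statistics.mean(controls)


def bootstrap_nes_variance(cases, controls, nes_fn, observed_nes,
                           n_boot=5000, n_keep=1000, seed=0):
    """Estimate the variance of an NES by stratified bootstrap (sketch).

    Cases and controls are resampled separately with replacement, the statistic
    is recomputed for each replicate, and the variance is taken over a random
    subset of replicates whose sign matches the observed NES.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        boot_cases = rng.choices(cases, k=len(cases))
        boot_controls = rng.choices(controls, k=len(controls))
        replicates.append(nes_fn(boot_cases, boot_controls))

    # Keep only replicates on the same side of zero as the observed NES
    # (a replicate of exactly zero is grouped with the negative values).
    same_sign = [x for x in replicates if (x > 0) == (observed_nes > 0)]
    if len(same_sign) < 2:
        raise ValueError("too few bootstrap replicates share the observed sign")

    kept = rng.sample(same_sign, min(n_keep, len(same_sign)))
    return statistics.variance(kept)
```

The resulting variance plays the role of the within-study variance of the NES for one gene set in one data set in the meta-analysis.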
We used a permutation procedure to generate a random null distribution of a sNES for a gene set. First, we generated two 522 × 9 matrices: one containing the NES and one containing the weights for the 522 gene sets and 9 nigral data sets. Then we randomly but concordantly permuted the two matrices so that, at each permutation, a weight was always yoked to its corresponding NES. A sNES was recomputed by the meta-analysis method described above. This procedure was repeated 100 million times. The P value for an observed sNES was the proportion of sNES in the permutation-generated frequency distribution that exceeded the observed sNES in absolute value. The same procedure was applied for the meta-GSEAs across stage 3, as well as across the 17 microarray data sets, and across 10 substantia nigra microarray data sets, respectively.
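A minimal sketch of this permutation test is given below. The text does not spell out the exact permutation scheme, so the sketch assumes that gene set labels are shuffled independently within each data set while every NES stays paired with its weight. Reading one row after such a shuffle is equivalent to drawing one gene set at random per data set, which is what the code does. The helper `summary_nes`, the DerSimonian-Laird form it uses, and all names are assumptions for illustration, and far fewer permutations are run than the 100 million reported.

```python
import random


def summary_nes(nes, weights):
    """Random-effects summary NES from per-study NES and inverse-variance weights."""
    k = len(nes)
    sum_w = sum(weights)
    t_bar = sum(w * t for w, t in zip(weights, nes)) / sum_w
    q = sum(w * (t - t_bar) ** 2 for w, t in zip(weights, nes))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (1.0 / w + tau2) for w in weights]
    return sum(w * t for w, t in zip(w_star, nes)) / sum(w_star)


def permutation_p_value(nes_matrix, weight_matrix, row, n_perm=10000, seed=0):
    """Permutation P value for the summary NES of one gene set (sketch).

    nes_matrix, weight_matrix: lists of rows (gene sets) by columns (data sets).
    row: index of the gene set being tested.
    """
    rng = random.Random(seed)
    n_sets = len(nes_matrix)
    n_studies = len(nes_matrix[0])
    observed = abs(summary_nes(nes_matrix[row], weight_matrix[row]))

    exceed = 0
    for _ in range(n_perm):
        # One random gene set per data set; NES and weight are taken from the
        # same cell so that each weight stays yoked to its NES.
        picks = [rng.randrange(n_sets) for _ in range(n_studies)]
        perm_nes = [nes_matrix[r][j] for j, r in enumerate(picks)]
        perm_w = [weight_matrix[r][j] for j, r in enumerate(picks)]
        if abs(summary_nes(perm_nes, perm_w)) > observed:
            exceed += 1
    return exceed / n_perm
```

With a finite number of permutations the smallest nonzero P value that can be reported is 1 divided by the number of permutations, which is one reason a very large number of iterations is needed to resolve small P values across hundreds of gene sets.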
For the PD-LBN GWES (Table 1 and table S3), substantia nigra tissue was obtained from snap-frozen human substantia nigras of 16 individuals with a clinicopathological diagnosis of incidental Lewy body disease. These individuals had incidental Lewy body pathology on neuropathologic examination, with Lewy bodies detected in some cases in the olfactory bulb only, or in brainstem nuclei of locus coeruleus and substantia nigra only, but not in the neocortex, consistent with early stage PD. Seventeen age-, sex-, PMI-, and RNA integrity number–matched controls, who were clinicopathologically within normal limits for age (see Table 1 and table S3 for detail), were included. Human brain samples used for the stage 2 GWES, as well as for the two qPCR validation studies (Fig. 4, B and D), were collected under the Brain and Body Donation Program at Sun Health Research Institute or obtained from the Harvard Brain Tissue Resource Center at McLean Hospital and the Massachusetts Alzheimer Disease Research Center Tissue Resource Center. All protocols were approved by the Institutional Review Board of Brigham and Women’s Hospital.
To quantify expression changes in individual SN pars compacta (pc) neurons proper in the Middleton-1 GWES (Table 1 and Fig. 2), we used a laser microdissection instrument (AS-LMD, Leica) to isolate SNpc cells for RNA isolation. Control and PD subjects were matched for age and PMI. Multiple 16-μm sections of midbrain from each subject were obtained on a cryostat at −20°C and mounted on the membrane of PEN (polyethylene naphthalate) foil slides (Leica). Adjacent slides were taken and stained for cytoarchitectural visualization (Nissl staining) of the SNpc. A total of about 80 to 200 darkly pigmented neurons were obtained from the SNpc in five to eight sections from each subject. These cells were readily visualized in the unstained frozen sections, where they were manually outlined with a computer mouse at ×200 magnification using the LMD 6000 AVC software (Leica). After outlining, the LMD apparatus automatically dissected the identified cells free of the PEN foil slide using a highly focused laser beam that followed the outlined path and allowed the cells to fall into a PCR tube cap containing 30 μl of RLT Lysis Buffer from the RNeasy Mini Kit (Qiagen). To maximize RNA integrity, we obtained all of the cells of interest from each frozen section within 10 min.
For the PD-LBN GWES of incidental Lewy body cases and controls (stage 2; Table 1), RNA was isolated and quality-controlled by Agilent Bioanalyzer as described (16). Only RNAs with preserved ribosomal peaks on electropherogram were forwarded for analysis, and RNA integrity numbers of cases and controls were matched. Total RNA (350 ng) was reverse-transcribed from each substantia nigra sample and hybridized to Illumina HumanHT-12v3 Expression BeadChips targeting more than 25,000 annotated genes with 48,803 probes derived from the National Center for Biotechnology Information (NCBI) Reference Sequence RefSeq (Build 36.2, Rel 22) and the UniGene (Build 99) databases, and scanned on a BeadArray Reader. Data were processed, normalized by “average normalization,” and quality-controlled with GenomeStudio. Procedures for the Middleton-1, Middleton-2, Middleton-3, and Miller GWES are described in the Supplementary Material.
All 17 GWES data sets have been submitted to publicly available databases under GEO accession numbers GSE6613, GSE7621, GSE8397 (two data sets), GSE20141, GSE20146, GSE20153, GSE20159, GSE20163, GSE20164, GSE20168, GSE20291, GSE20292, GSE20314, GSE20333, and GSE24378; and National Brain Databank accession name “Parkinson’s.”
We thank A. J. Ivinson and C. R. Vanderburg (Harvard NeuroDiscovery Center) and B. M. Spiegelman (Dana-Farber Cancer Institute) for insightful comments. We thank the Harvard Brain Tissue Resource Center at McLean Hospital (supported by NIH grant R24MH068855), the Massachusetts Alzheimer Disease Research Center Tissue Resource Center at Massachusetts General Hospital, and the Sun Health Research Institute Brain and Body Donation Program of Sun City, Arizona (supported by P30AG19610 Arizona Alzheimer’s Disease Core Center, Arizona Biomedical Research Commission, Prescott Family Initiative of the Michael J. Fox Foundation) for brain samples from subjects with incidental Lewy body disease, PD, and controls. See Supplement for further acknowledgments. Funding: This work was seeded by an Edmond J. Safra Global Genetics Consortium Award from the Michael J. Fox Foundation (C.R.S.) and supported by a Paul B. Beeson K08AG024816 from the National Institute on Aging and the American Federation for Aging Research (C.R.S.), NIH grants R01NS064155, R21NS060227, and P01NS058793 (all to C.R.S.), NIH grant R01NS049221 (J.-C.R.), the M.E.M.O. Hoffman Foundation (C.R.S.), the RJG Foundation (C.R.S.), the Parkinson’s Disease Foundation (J.-C.R.), the Parkinson’s Disease Society of the UK (M.B.G.), Nationales Genomforschungsnetz (Bundesministerium für Bildung und Forschung 01GS0115; TP9) (U.W.), and Brain Net Europe II (P.R.).
Author contributions: C.R.S. served as coordinating principal investigator of the Global PD Gene Expression Consortium, designed the study, performed and supervised experiments, wrote the paper, and contributed funding. J.-C.R. designed and supervised experiments, analyzed data, wrote the paper, and contributed funding. F.A.M. performed and supervised experiments and wrote the paper. B.Z., J.J.L., and A.C.E. analyzed data and wrote the paper. J.C.H. characterized the detailed neuropathologic and clinical status of incidental Lewy body and control subjects, and provided their snap-frozen substantia nigra specimens from the Harvard Brain Tissue Resource Center. C.H.A. and T.G.B. characterized the detailed neuropathologic and clinical status of incidental Lewy body and control subjects, provided their snap-frozen substantia nigra specimens, contributed to writing the paper, and obtained funding for the Sun Health Research Institute Brain and Body Donation Program of Sun City, Arizona. Z.L., S.S.R., M.A.H., Y.Z.-J., P.D.K., K.A.L., M.L.W., E.G., L.B.M., S.A.M., R.M.M., and I.C.-C. performed experiments and contributed data. P.R., H.J.F., U.W., S.P., M.B.Y., A.B.Y., J.M.V., R.L.D., and M.B.G. designed and supervised experiments, contributed data, and revised the paper.
Competing interests: S.P. holds stock in Allergan Inc. and Biogen Idec Inc. C.R.S. has received consulting fees from Link Medicine Corp. and the Michael J. Fox Foundation. He is a scientific collaborator of DiaGenic in a study entirely funded by the Michael J. Fox Foundation and has received speaking fees from the International Movement Disorders Society, as well as being listed as co-inventor on a U.S. patent held by Brigham and Women’s Hospital relating to diagnostics for neurodegenerative diseases. None of the other authors have any competing interests to declare.