|Home | About | Journals | Submit | Contact Us | Français|
The genetics of complex disease produce alterations in the molecular interactions of cellular pathways whose collective effect may become clear through the organized structure of molecular networks. To characterize molecular systems associated with late-onset Alzheimer’s disease (LOAD), we constructed gene regulatory networks in 1647 post-mortem brain tissues from LOAD patients and non-demented subjects, and demonstrate that LOAD reconfigures specific portions of the molecular interaction structure. Through an integrative network-based approach, we rank-ordered these network structures for relevance to LOAD pathology, highlighting an immune and microglia-specific module dominated by genes involved in pathogen phagocytosis, containing TYROBP as a key regulator and up-regulated in LOAD. Mouse microglia cells over-expressing intact or truncated TYROBP revealed expression changes that significantly overlapped the human brain TYROBP network. Thus the causal network structure is a useful predictor of response to gene perturbations and presents a novel framework to test models of disease mechanisms underlying LOAD.
Complex diseases such as late-onset Alzheimer’s disease (LOAD) arise from the downstream interplay of DNA sequence variants and non-genetic factors that act through molecular networks to confer disease risk (Schadt, 2009). Despite decades of intensive research, the causal chain of mechanisms behind LOAD remains elusive. In fact, there are no effective disease modifying or preventive therapies and the only available treatment remains symptomatic; meanwhile the incidence of LOAD is expected to double by 2050 (Brookmeyer et al., 2007). Progress in LOAD research is fundamentally limited by our reliance on mouse models of severe familial/early-onset Alzheimer’s disease: therefore our primary knowledge of LOAD is in actuality based on the downstream effects of three rare mutations in APP, PSEN1 and PSEN2 (Bertram et al., 2010). While such mouse models are necessary and helpful, the cognitive deficits in these transgenic mice are less severe than those in humans and they do not exhibit equivalent neurodegeneration, which is the most accurate clinical marker of cognitive disease progression in humans. Correspondingly, attrition rates from early discovery to late drug development have been very high (Schafer and Kolkhof, 2008).
In contrast to the plethora of potential disease mechanisms detected in humans with LOAD, the search for LOAD-modifying interventions has focused primarily on compounds targeting the amyloid-β pathway. Both biological risk factors, often related to vascular health and psychosocial factors (Cechetto et al., 2008; Qiu et al., 2010), as well as genetic susceptibility play a critical role in the underlying pathophysiology of LOAD (Bertram et al., 2010). APOE is still the best validated suceptibility gene accounting for at least 30% of the genetic variance in LOAD (Corder et al., 1993). Several additional genetic risk loci for LOAD have been identified via genome-wide association studies (GWAS) that seem to cluster in patterns suggesting immunity (CLU, CR1, CD33, EPHA1, MS4A4A/MS4A6A), lipid processing (APOE, ABCA7) and endocytosis (PICALM, BIN1, CD2AP) as important causal biological processes (Bettens et al., 2013). More recently, low-frequency missense variants in APP and TREM2 were found to confer strong protection or elevated risk of LOAD (Guerreiro et al., 2013; Jonsson et al., 2012; Jonsson et al., 2013). However, the overall contribution of these new common and low-frequency variants to the heritability of LOAD is very small, suggesting that a large fraction of the genetic variance beyond the APOE risk still remains hidden. Can we clarify the pathology of LOAD by zooming out to the pathway level to search for emergent risk of many genomic contributions? If so, how can we identify the key causal genes in these pathways?
In light of the complexity and elusiveness of LOAD pathogenesis, new approaches are needed to boost the probability of identifying causal genes and pathways. Recently, we have leveraged the molecular network structure that is reflected in genotypic and gene expression data to uncover biologically meaningful gene modules involved in the development of complex disease (Chen et al., 2008; Emilsson et al., 2008). Targeting such causal networks in ways that restore them to a normal state has been proposed as a path to treat disease (Schadt et al., 2009), but this potential has never been realized for LOAD. However, the complexity of these networks makes it difficult to distinguish the causal from correlated disease effects or how the causal regulators propagate their effects. To better address this, we constructed molecular networks based on whole-genome gene expression profiling and genotyping data in 1647 autopsied brain tissues from hundreds of LOAD patients and non-demented subjects. We identified numerous modules of distinct functional categories and cellular specificity, many showing a massive remodeling effect in the LOAD brain. Next we applied an integrative network-based approach to rank-order these modules for relevance to LOAD pathology and used a Bayesian inference to identify the key causal regulators of these remodeled networks. For instance, we identified, eight causal regulators of the top-ranked immune/microglia module including TYROBP (aka DAP12) as the highest ranking in terms of regulatory strength and differential expression in LOAD brains. We demonstrate through mouse microglia cells over-expressing intact or truncated dominant-negative TYROBP that downstream expression changes significantly overlapped the human TYROBP brain network. This study presents many of the network advantages useful in identifying and prioritizing pathways and gene targets involved in the pathophysiology of LOAD.
We developed and applied an integrative network-based approach to identify modules of genes associated with neurodegenerative disease (Figure 1A-C). We processed 1647 autopsied tissues from dorsolateral prefrontal cortex (PFC), visual cortex (VC) and cerebellum (CB) in 549 brains of 376 LOAD patients and 173 non-demented healthy controls (Figure 1A). All subjects were diagnosed at intake and each brain underwent extensive LOAD-related pathology examination. We note that the known APOE genotype exposure was confirmed in the HBTRC sample, showing an odds ratio of 3.74 per copy ε4 allele (P=4.1e-13). Each tissue sample was profiled for 39,579 transcripts representing 25,242 known and 14,337 predicted gene expression traits and each subject genotyped for 838,958 unique SNPs (Figure 1A). Unless otherwise noted, gene expression analyses were adjusted for age and sex, postmortem interval (PMI) in hours, sample pH and RNA integrity number (RIN). In the overall cohort of LOAD and non-demented brains the mean±SD for sample PMI, pH and RIN were 17.8±8.3, 6.4±0.3 and 6.8±0.8, respectively. Extensive analysis of the effect of covariates on gene expression variation in LOAD and non-demented brains was carried out, as shown in Figure S1 and described in the Extended Experimental Procedures. Here, we used a robust linear regression model for covariate corrections in all our gene expression analyses (Experimental Procedures). Results of traditional differential expression analysis demonstrate that subsets of genes were up- or down-regulated in LOAD (Figure 2A). Consistent with the known progression and regional severity of LOAD pathology (Braak and Braak, 1991), we observed the PFC region contained the greatest number of differentially expressed genes (Figure 2B). Figure 2C summarizes the clustering or co-linearity of the various LOAD pathology traits and age within the HBTRC cohort, resulting in distinct groups of clinical pathology and age as separate clusters. For instance, the number of significant correlations of expression traits to neuropathology like Braak stage within the LOAD patient group was highest in the PFC region (Figure 2D). Given these observations and the fact that PFC is more commonly affected in LOAD than CB and VC (Braak and Braak, 1991), a particular attention was paid to this region in our strategy to rank-order modules for relevance to LOAD. These massive datasets were the basis of further method development with the aim to identify and rank-order network modules and gene targets associated with LOAD pathology (Figure 1A-C). Results of these various analysis steps are discussed in the sections that follow, and a more detailed description of methods and statistical procedures is found in the Extended Experimental Procedures.
For simultaneously capturing the intra- and inter-regional gene-gene interactions in the LOAD or non-demented state, we constructed multi-tissue co-expression networks consisting of the top one-third (n=13,193) of the most variable gene expression traits per brain region in individuals donating tissues from all three regions (Extended Experimental Procedures). The multi-tissue co-expression network in LOAD brains indicated strong structurally segregated regions of the human brain molecular interactome (Figure 3A), generating 111 modules in the LOAD network, each containing between 30 and 1446 gene members (Figure 3A), while the network generated from non-demented samples has 89 modules ranging in size from 30 to 2278 genes. Figure 3B highlights a direct comparison of the two topological overlap matrices corresponding to the LOAD or non-demented associated network for a subset of 16 modules, demonstrating that LOAD reconfigures specific portions of the molecular interaction structure. To analytically detect and quantify this network reorganization across the demented and non-demented states, we developed a metric we refer to as modular differential connectivity (MDC) (Extended Experimental Procedures). MDC is the ratio of the average connectivity for any pair of module sharing genes in LOAD compared to that of the same genes in the non-demented state, and is a continuous measure ranging from 0 to infinity. This module-centric measure of differential connectivity between the two states is therefore fundamentally different from the gene-centric analysis of previous studies that applied hard cutoffs (Mani et al., 2008). Given the nature of the co-expression network analysis, MDC > 1 indicates gain of connectivity (GOC) or enhanced co-regulation between genes, while MDC < 1 indicates loss of connectivity (LOC) or reduced co-regulation between genes. In extreme cases where MDC >>1, e.g. the glutathione transferase (GST) module (Figure 3B), or MDC <<1, e.g. the nerve myelination module (Figure 3B), the corresponding genes do not form a coherent cluster in the non-demented state or LOAD, respectively. Thus, new modules are created in LOAD while in other cases a portion of the network is completely disrupted. The statistical significance of the MDC metrics was computed through the false discovery rate (FDR) procedure as described in Extended Experimental Procedures. Based on 10% FDR, 54% of all modules showed GOC while 4.5% of modules exhibited LOC. The structure of the remaining 41.5% of the modules were found to be conserved across the LOAD and non-demented states by this MDC measure. We note a negligible overlap of only 6% between signatures of differential connectivity and standard differential gene expression in LOAD brains, implying that the observed disruption in co-regulation of genes reflects a previously untapped marker of LOAD neuropathology.
As observed in previous network-based studies (Chen et al., 2008; Emilsson et al., 2008; Zhang and Horvath, 2005), we find that the brain gene expression is organized into modules of distinct functional categories (Figure 3C). Over-representation of canonical pathways and biological processes in modules was measured through Fisher’s exact test (FET), corrected for number of modules and functional categories tested. Figure 3C highlights significant over-representation of functional categories in modules showing GOC, LOC or conserved connectivity and containing at least 50 genes. The multi-factorial basis of LOAD neuropathology involves biological processes active in both the CNS and the metabolic and vascular peripheral system that have often progressed silently for many years (Huang and Mucke, 2012; Murray et al., 2011). In fact, we find that multiple functional categories are highly enriched in the LOAD-associated modules, including the immune response, unfolded protein, vascular system, extracellular matrix, neurogenesis (brain development), glucose homeostasis, synaptic transmission and olfactory sensory perception categories in the GOC modules (Figure 3C), while the LOC modules are enriched for genes involved in nerve myelination, cell-cycle, γ-aminobutyricacid (GABA) metabolism, and neurotrophin signaling (Figure 3C). Many of these functional categories have previously been implicated in LOAD and/or CNS-related function (Ansari and Scheff, 2010; Cechetto et al., 2008; Dodel et al., 2003; Luchsinger, 2008; Morawski et al., 2012; Schiffman et al., 2002), again reinforcing the complex multi-factorial basis of the underlying pathophysiology. The functional categories enriched in the conserved modules included “muscle contraction” (actin-related system), coated vesicle, cadherin and zinc ion metabolism (Figure 3C).
CNS cell-type specific gene signatures, from the Allen Brain Atlas (http://www.brain-map.org/) were enriched in distinct network modules as previously observed (Oldham et al., 2008): neurons in the synaptic transmission modules (11 fold, P=3.7e-24), astrocytes in GABA biosynthesis module (22 fold, P=1.5e-15), oligodendrocytes in the nerve myelination module (30 fold, P=2.5e-30), choroid plexus cell types in the extracellular matrix module (35 fold, P=3.9e-15) and microglia signatures responding to amyloid-β treatment (Walker et al., 2006) in the immune module (10 fold, P=4.5e-20) (Figure 3C). In contrast to the GOC and LOC modules, conserved modules were not enriched for any CNS specific cell types (Figure 3C). Pathways enriched in the brain modules and not previously implicated in LOAD, may therefore represent novel disease mechanisms including, for instance, the glucuronosyl transferase activity and the dynein complex (Figure 3C). Moreover, the comprehensive representation of gene-gene interactions in the LOAD-associated networks can uncover novel gene members in pathways already implicated in LOAD, thus allowing us to work out a known pathologic mechanism in more detail than ever before. In summary, the immune module shows the statistically most significant functional enrichment of all modules (Figure 3C), and as such may have a more comprehensive representation of its respective pathway members.
Table S1, available online in the Supplementary Data, contains extensive information regarding the functional enrichment and gene membership of modules containing at least 50 unique gene symbols. We highlight some specific findings of interest from Figure 3C: (1) the enrichment of pathways related to olfactory sensory perception in a LOAD-associated module is of interest given the processing of olfactory function is affected in subjects who are genetically at risk of developing LOAD long before the symptoms of dementia are manifested (Schiffman et al., 2002); (2) the APOE transcript is located in the LOC module enriched for astrocyte signatures and GABA metabolism, consistent with the observation that astrocytes are the major source of APOE in the CNS (Boyles et al., 1985). The close connectivity of APOE and GABA metabolism in the brain network, may therefore have some bearing on the observation that GABA interneuron dysfunction is particularly severe in APOE4 carriers (Li et al., 2009); (3) the previously identified macrophage enriched metabolic network (MEMN) in peripheral tissues and strongly supported as causal for a number of metabolic and vascular traits related to obesity, diabetes and heart disease (Chen et al., 2008; Emilsson et al., 2008), is remarkably enriched within the brain immune/microglia module (3.9 fold, P= 2.4e-46). This is of interest given the strong epidemiological evidence for metabolic- and vascular-based exposure on LOAD (Huang and Mucke, 2012; Murray et al., 2011); (4) The postsynaptic density proteome in the human neo-cortex of 748 proteins over-represented with risk loci known to underlie cognitive, affective and motor phenotypes (Bayes et al., 2011), was significantly enriched in the synaptic transmission module (3 fold, P = 1.6e-32). It is still unclear how and which of these different biological processes mentioned above interact to affect LOAD, however it is likely that only few downstream mechanisms on which many diverse effects converge are causally related to LOAD (Huang and Mucke, 2012; Murray et al., 2011). The accumulated data show a strikingly coherent organization of molecular processes in the LOAD-associated network.
The co-expression network structure, its changes between non-demented and LOAD brains, and the genetic loci responsible for the expression co-variation behind these networks, collectively reflect molecular processes associated with LOAD. By linking the network modules to clinical outcome or LOAD neuropathology via a multiple regression analysis (Extended Experimental Procedures), we can infer key molecular processes associated with LOAD. A covariance matrix of the average expression correlation (r) between 49 modules, comprised of at least 100 probes, and 25 LOAD-related traits is shown in Figure 4A. We performed principal component analysis (PCA) to estimate the module-trait correlation and used the FDR method to assess the significance (see Extended Experimental Procedures). Of all modules, the immune/microglia showed correlation to the greatest number of LOAD-related neuropathology traits (Figure 4B). Expression of the PFC immune/microglia module correlated to atrophy levels in multiple brain regions, including frontal cortex (r=0.27, FDR=0.018), parietal (r=0.20, FDR=0.016), temporal (r=0.19, FDR=0.022) and neostriatum regions (r=0.28, FDR=3.3e-09) as well as ventricular enlargement (r=0.17, FDR=0.031). Several modules, however, showed correlation to a more restricted type of neuropathology, including the modules characteristic for the glucuronosyl transferase correlated to Braak stage (r=0.18, FDR=9.8e-05), NAD(P) homeostasis to Braak stage (r=0.25, FDR=1.4e-07), neurogenesis to ventricular enlargement (r=0.19, FDR=5.1e-05) and GST to ventricular enlargement (r=0.22, FDR=4e-06). The significance of functional enrichment in modules and the number of neuropathology traits correlated with modules were considered important criteria in rank-ordering modules for their potential to affect LOAD.
Causal probabilistic Bayesian networks were constructed and used as an alternative approach to delineate potential regulatory mechanisms. In order to establish a causal relationship or dependency between nodes in the network, we constructed a directed probabilistic Bayesian network through the application of brain cis expression (e)SNPs as causal anchors. Because cis eSNPs are in LD with causal variants that affect the expression levels of a neighboring gene or they are the causal variant themselves, they serve as an excellent source of natural perturbation to infer causal relationships among genes and between genes and higher order phenotypes like disease (Chen et al., 2008; Emilsson et al., 2008). We detected a total of 11,318 unique cis eSNPs-transcripts in the three brain regions, at FDR of 10% (Figure S2A), which is the largest number of brain eSNP-transcripts detected to date in a single study (Webster et al., 2009). The methodology to identify cis and trans eSNPs is detailed in Extended Experimental Procedures, while Table S1 lists all cis and trans acting eSNPs detected in the present study at FDR of 10%. There was between 70 and 80% sharing of cis eSNP-transcripts between different brain regions and 37% overlapped all brain regions (Figure S2A). Importantly, we find a variable and often strong enrichment of brain eSNPs in many of the LOAD-associated modules compared to all probes on the array, suggesting the possibility that these variants determine the differential connectivity observed in LOAD. For instance, in the PFC region (Figure 4C) there were five modules showing significant enrichment for cis eSNPs including the unfolded protein (3.8 fold, P=3.8e-81), nerve myelination (2.5 fold, P=2.9e-40), immune function (2.2 fold, P=4.3e-30), GABA metabolism (2.7 fold, P=2.3e-13) and extracellular matrix (1.6 fold, P=2.3e-07) modules (Figure 4C). The enrichment of cis eSNPs in the differentially connected LOAD modules in the VC and CB regions is shown in Figure S2B. For the present study, however, a particular attention was paid to the cis eSNPs for their applicability as priors in the construction of Bayesian networks (Extended Experimental Procedures and schematic Figure S3).
We constructed Bayesian networks for each co-expression module. While many of the LOAD-associated network modules are of potential interest, the reconstruction of the Bayesian network for the immune/microglia module is highlighted given it has the strongest disease association based on clinical covariates and network-associated properties: (1) significant differential connectivity of the cortex specific immune modules in LOAD (MDC between 49 and 100% GOC at FDR < 0.001); (2) the immune/microglia module showed the most significant enrichment of functional categories; (3) the highest degree of gene expression correlation to several measures of LOAD neuropathology; (4) the PFC version of the module was highly enriched for brain eSNPs. To increase the predictive power of inflammation related regulatory networks, we further built up the directed Bayesian network for the inflammation modules derived from the individual brain regions. Figure 5 highlights the interactions within and between the five predominant immunologic families in the PFC-based putative microglia module. To generate this roadmap to the complex structure of the immune/microglia module, genes which were not direct members of one of these five core pathways were assigned to the family with which they have the greatest number of causal interactions. The immune module was dissected into five families representing functional immune pathways that were labeled according to their main function as ‘Complement’, ‘Fc’ for Fc-receptors, ‘MHC’ for major histocompatibility complex, ‘Cytokines’ for cytokines/chemokines and ‘Toll-like’ for toll-like receptors (Figure 5).
The Bayesian inference enabled us to compute the causal regulators of the differential connectivity in individual modules, defined as the genes controlling many downstream nodes in the respective network (see Extended Experimental Procedures). The causal regulators of the highest scoring immune/microglia module were rank-ordered based on the number of downstream nodes, i.e. the power of regulating other genes, as well as differential expression in LOAD brains. Here, we used a combined score as , where, gji is the discriminant value of a j in the case i, and is defined as (Duda et al., 2000). In comparison to the average gene/node in a given network the causal regulators are expected to have a stronger effect on the clinical outcome as they direct the expression of a significant portion of the network module they reside in. The size of the gene membership for the different regional specific immune modules ranges from 386 in CB to 1108 in the PFC, with 247 of the genes in the CB detected in all regions (P=1e-19). The identity of the key causal regulators is somewhat variable across each brain regional version of the microglia module of which CTSC, HCK, TYROBP, SERPINA1, S100A11, LY86, DOCK2 and FCER1G were common to all immune modules, regardless of brain region. Through the combined ranking score based on regulatory strength and differential expression in PFC of LOAD brains, TYROBP scored the highest (Figure S4A). Table 1 lists the 20 top ranking PFC modules and their respective key causal regulators. Expression of TYROBP is restricted to cells involved in the innate immunity including the microglial cells in the brain (Schleinitz et al., 2009). Here, TYROBP was significantly up-regulated in LOAD brains in the HBTRC sample (1.18 fold, P = 0.028) and the direction of this effect was replicated (1.17 fold, P = 5.1e-05) in an independent multi-center cohorts study (see Extended Experimental Procedures and Figure S4B). Additionally, we observed a progression of TYROBP expression changes across mild cognitive impairment (MCI) in the replication study (Figure S4B). Estimating what constitutes a “large” or “small” change in gene expression levels is challenging in microarray analyses. We note, however, that TYROBP was the 124th most differentially-expressed probe out of 48,803 probes assayed in the replication study cohort. Moreover, TYROBP was more differentially expressed in LOAD brains than the classical markers of microglia AIF1, and CD68, indicating there was not a relative down-regulation of TYROBP despite elevated microgliosis in LOAD brains (Perry et al., 2010).
The majority of the common causal regulators were located either in the ‘Fc’ pathway and associated/clustered genes (HCK, SERPINA1, S100A11, DOCK2 and FCER1G) or the ‘Complement’ pathway (TYROBP) in the immune/microglia network (Figure 5). Recent reports (we note that our submission predates these reports) show a striking association of a low-frequency DNA variant in TREM2 to LOAD (Guerreiro et al., 2013; Jonsson et al., 2013). More specifically, TREM2 is known to associate and signal via TYROBP, the key regulator of the immune/microglia network activated in LOAD. Thus our data-driven network-based approach places both TREM2 and TYROBP in a gene network that literally unifies them with previous top GWAS risk loci including MS4A4A, MS4A6A and CD33 (Figure 5). These new results provide exciting convergent evidence for the specific microglia network that we had directly implicated as activated in LOAD, and reinforce the potential causality of this pathway in LOAD pathology. In fact, the dissection of the immune/microglia module into distinct families and key causal regulators point towards an important function of the microglia pathways involving genes of the ‘Complement’ and/or ‘Fc’ network clusters. Figure S5 (genes marked in red) highlight many of the key genes in the pathogen phagocytosis pathways found in the immune/microglia module. It is noteable how comprehensive representation of a specific signal transduction pathways is observed within the two immune families of this module. The strategic network position of TYROBP as a causal regulator of many genes mirrors its bottleneck position in several microglia activation signaling cascades. Extrapolating from this data-driven interaction, it is possible that TYROBP may be associated with neuronal pruning activity of the complement system that may be reawakened in LOAD via amyloid-β and Tau aggregates (Stevens, 2007; Perry, 2010). In this manner, the network structure can become a data-driven hypothesis generator for novel disease-relevant interactions.
To test our prediction that TYROBP can direct LOAD-associated gene networks, we contrast both the molecular function and genome-wide effects of TYROBP with those predicted by the structure of causal networks inferred from human LOAD brains. For this, microglia cells derived from mouse embryonic stem cells were genetically modified by lentiviral vectors to over-express either full length or a truncated version of Tyrobp which lacks both intracellular ITAM motifs (Extended Experimental Procedures and Figure S6). To assess the genome-wide gene expression changes in response to the perturbation of Tyrobp, we derived gene expression data from the RNA sequencing of mouse microglia cell lines over-expressing (1) vehicle, (2) the full length Tyrobp or (3) dominant negative truncated Tyrobp. We identified 2638 and 3415 differentially expressed genes for the over-expression of full length Tyrobp and truncated Tyrobp, respectively (Table S1), at FDR < 2.5%. Roughly one-third (858 to 1092) of these genes are found in the most variable gene set in the brain dataset used for the network reconstruction. The PFC variant of the human immune/microglia module was highly enriched for genes which are differentially expressed in the full length or truncated Tyrobp experiments (P < 1e-15) (Figure 6A). We projected results of RNA sequencing experiments onto a large Bayesian brain network of ~8000 nodes that contains the microglia module as well as many other modules. In this large network, we could track differential expression of genes which are predicted to be downstream of TYROBP at various network path distances (Figure 6B). The highest predictive power for differential expression is in the primary neighborhood of the perturbed gene, and this power decreases for genes which are farther away in the network. The enrichment for differentially expressed genes in the network neighborhood of TYROBP and strong negative correlation between the fraction of confirmed targets and path distance (r= −0.82, P=4.e-07) (Figure 6B), show that our causal network structure is a significant and useful predictor of response to gene perturbations, even in a challenging cross-species setting. Thus both the structure and direction of links in these causal networks provide significant information on the effects of complex signal transduction mechanisms.
The inferred network structure has significant predictive power for nodes which are several links away from TYROBP. We studied the enrichment of functional categories in the gene sets responding to the Tyrobp perturbation experiments, and applied Bonferroni-corrected P-values for statistical significance (Extended Experimental Procedures). Approximately 99% of the differentially expressed genes from the microglia over-expressing intact Tyrobp were down-regulated compared to the control vehicle. This set was enriched for genes involved in RNA metabolism (P=6.2e-05) and cell-cycle mitosis (P=2.7e-03). In the microglia cells over-expressing the dominant negative truncated Tyrobp, 2856 up-regulated genes were enriched for the vacuole/autophagy (P=1.7e-08) and mitochondrion (P=4.6e-04), while 559 genes involved in histone assemply (P=1.6e-31) were down-regulated. Moreover, the Tyrobp regulatory effect reflects a degree of symmetry as 658 genes, related to the vacuole/autophagy (P=5e-03), were down-regulated by active Tyrobp but up-regulated in cells expressing dominant negative truncated Tyrobp. These findings are of interest because they link the far down-stream effects of TYROPB to known molecular pathology in LOAD, such as abnormalities in the cell-cycle, mitochondrion and autophagy (Coskun et al., 2004; Webber et al., 2005). The accumulated data suggest that TYROBP may be a therapeutic target in prevention of neuronal damage in LOAD.
The construction of gene regulatory networks in a large sampling of human brain specimens has revealed many facets of the molecular interaction structure in LOAD, when compared to that in non-demented brains. A comprehensive characterization of gene network connectivity, its regulation and association to disease can provide critical insights into the underlying mechanisms, and identify genes that may serve as effective targets for therapeutic intervention. For instance, targeting genes that are the most central (highly connected) may be more effective in disrupting disease-related networks for the purpose of therapy, but that could be at the cost of more adverse effects. In summary, the utility of network-based approaches to complex disease includes: (1) elucidating the biological function and molecular context of a particular set of causal genes, (2) establishing a framework to map interaction between genes and network modules, (3) providing an objective filter for rank-ordering genes based on connectivity or other network features, (4) defining dynamic changes and corresponding causal regulators of the altered network structure associated with disease condition, (5) identifying modules and pathways causally related to disease, and (6) revealing tissue-to-tissue interactions that can aid in the identification of key target tissues for disease (Dobrin et al., 2009). The present study utilizes many of these network advantages to highlight and prioritize pathways and gene targets causally related to LOAD.
Our network-based integrative analysis not only highlighted the immune/microglia module as the molecular system most strongly associated with the pathophysiology of LOAD, but identified the key network regulators including TYROBP. In a separate in vitro study we have found that the microglia expressed TYROBP is directly involved in amyloid-β turnover and neuronal damage (unpublished results). Of interest, mutations in TYROBP or TREM2 cause Nasu-Hakola disease (Bianchin et al., 2010), a rare Mendelian disease characterized by bone reabsorption dysfunction and chronic inflammatory neurodegeneration leading to death in the fourth or fifth decade of life. The exact pathomechanism underlying Nasu-Hakola disease is still unclear, but it was hypothesized that failure of proper microglial clearance is causal for the lethal effect of neurodegeneration. Thus dysfunctional immune/microglia pathways might not be unique to LOAD. To test the generalization of this concept, we explored the connection of the immune/microglia module to Huntington disease (HD), another neurodegenerative disease. HD pathology, caused by expanded alleles of a variable stretch of trinucleotide (CAG) repeat length in HTT (The Huntington’s Disease Collaborative Research Group, 1993), features astrogliosis and neurodegeneration of the striatum, prefrontal cortex and hippocampus. We constructed molecular networks in the PFC from 194 HD patients genotyped for CAG allele size (see Extended Experimental Procedures) and found that the PFC version of the immune/microglia module was well conserved between LOAD and HD in terms of gene annotation (75% overlap, P-value <1e-300). This module, however, did not show any alteration in connectivity in HD brains compared to the disease-free controls used in our LOAD study. Moreover, through a PCA we did not detect any gene expression correlation of the HD brain immune/microglia module to expanded CAG repeat length (r=-0.05, FDR=56%), a key biomarker for predicting HD severity (Gusella and MacDonald, 2006). Thus based on the comparison to HD, the disease-related effect of the immune/microglia module appears to be specific to LOAD (and possibly Nasu-Hakola disease).
Immune activation in LOAD may have multi-faceted activity: long-term use of non-steroid anti-inflammatory drugs (NSAIDs) before onset of the disease decreases risk (Etminan et al., 2003), and microglia effector function via interfering with reactive oxygen production, cytokines and complement cascade members have been postulated to damage healthy neurons and synapses (Cameron and Landreth, 2010). Close association and positive feedback between amyloid-β and microglia (Meyer-Luehmann et al., 2008) further clouds the cause and effect relationships of inflammation to disease progression. Without a causal framework for these observations, it is difficult to find optimal molecular targets that direct LOAD inflammation. Therefore, we integrated clinical factors with whole-genome genotype and molecular trait data to identify a network module containing several microglia signaling cascades functionally related to the reactive oxygen burst during pathogen phagocytosis. We highlight the causal regulator TYROBP that exerts control over multiple genes within this module and pathways involved in LOAD, thus validating our network structure and its relevance to LOAD pathology. This approach appears to offer novel insights for drug discovery programs that can affect neurodegenerative diseases, such as LOAD.
Raw gene expression data together with information related to demographics, disease state and technical covariates are available via the GEO database (GEO accession number GSE44772; GSE44768, GSE44770 and GSE44771). A brief description of key methods and sample description is provided below while complete details are found in the Extended Experimental Procedures.
We compiled six disease- and tissue-specific gene expression datasets consisting of 1647 postmortem specimens from three brain regions (PCF (BA9), VC (BA17) and CB) in LOAD and non-demented subjects recruited through the Harvard Brain Tissue Resource Center (HBTRC). Each subject was diagnosed at intake and via extensive neuropathology examination. Tissues were profiled on a custom-made Agilent 44K array of 40,638 DNA probes and each subject genotyped for 838,958 SNPs.
We constructed both multi-tissue and single tissue co-expression networks from the top one-third (n=13,193) of the most variably expressed genes in each tissue and condition. We computed the module differential connectivity (MDC) in LOAD brains as: , where, kij is the connectivity between two genes i and j in a given network, and assessed the statistical significance through the FDR method. We constructed causal probabilistic Bayesian networks from individual co-expression modules and used brain cis eSNPs as priors to infer directionality between nodes (see Figure S3). For this, we identified 11,318 unique cis eSNPs-transcripts at FDR of 10% (Extended Experimental Procedures), all listed in Table S1. The Bayesian inference allowed us to compute the causal regulators of the differential connectivity in individual modules by examining the number of N-hob downstream nodes.
Genome-wide gene expression of mRNA from cultivated microglia cells over-expressing intact or genetically modified TYROBP was sequenced using TruSeq Kit for RNA capture and HiSeq 2000 for the sequencing. Read mapping was done using the TopHat (Trapnell et al., 2009) RNA-seq aligner.
H.N. and L-G.B. were supported by the Hertie-Foundation and the Deutsche Forschungsgemeinschaft (FOR1336, SFB704, KFO177). H.N. is a member of the DFG-funded excellence cluster ImmunoSensation. We thank Jessica Schumacher and Rita Hass for technical assistance. We are grateful to the HBTRC for the generous gift of human postmortem brain samples. The authors are also grateful to the participants in the Religious Orders Study and the Memory and Aging Project. This work is supported by the National Institutes of Health (R01AG034504, R01 AG030146, P30 AG10161, R01 AG17917, R01 AG15819, K08 AG034290, P30 AG10161, R01 AG11101 and NS032765), the Illinois Department of Public Health and start-up funding from the University of Miami, Miller School of Medicine. JM, AAP, CZ, TX, RD, EF, SM, MN, CM and DJS are employees and shareholders of Merck & Co., Inc. JRL is an employee and shareholder of Novartis
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
SUPPLEMENTAL INFORMATION Supplemental Information includes Supplementary Data with one table, Extended Experimental Procedures with six figures and can be found with this article online at doi:xx