As hypothesized, we validated a scaled-up Para-ICA approach to reveal novel interactive genes and pathways for LOAD. This highlights a primary advantage of Para-ICA: it effectively captures genotype-phenotype relationships using modest sample sizes compared with conventional GWAS analyses. The single structural component that was significantly associated with four different SNP/genetic networks contained dominant loading coefficients in all major regions affected by LOAD pathology. Seven other structural components were not associated with any gene components. The genetic components identified included SNPs from APOE4 plus multiple other risk genes (and putatively protective SNPs, e.g. APOE2) either previously identified in LOAD risk (e.g. ATF7 in G3 (Lin et al., 2006)) or involved in one or more biological processes thought to contribute to LOAD pathology. The accompanying figure summarizes the involvement of these four genetic risk networks on a LOAD physiologic pathway diagram.
A summary of different genetic components (circled in red) identified in the current study projected onto an Alzheimer’s disease functional interaction pathway (modified from Sleegers K et al., TIGS 2009)
The most significant association was between G1 and S1. This genetic network had significant loading contributions from a total of 169 different genes (332 SNPs) and correlated negatively with brain network S1, implying that increased genetic load is related to decreased brain volume/thickness within the network. This association was notable because G1 had high loadings from APOE4 (in the top 10 gene loadings) and S1 included regions known to be affected early and severely in LOAD, including entorhinal, middle temporal and prefrontal cortices and hippocampus. More importantly, this genetic network had high loadings from several other genes (SLC9A7/NHE7, ZNF673, SHROOM2) not previously identified in LOAD pathology. Given that they were part of the same independent genetic component as APOE, this finding both confirms APOE’s established role as an important LOAD risk gene and suggests that these additional SNPs may interact with APOE to influence disease risk, supporting APOE’s role as a LOAD risk factor rather than a direct cause (Guerreiro et al., 2010). The protein encoded by SLC9A7 mediates Na+/H+ exchange across cell surface plasma membranes (Kagami et al., 2008), cycling between the cell surface and intracellular trans-Golgi network and recycling endosomes, which are vital to APP processing (Marks and Berg, 2010). SLC9A7 co-localizes with actin, which is implicated in tau formation (Gallo, 2007). LOAD lymphoblasts show abnormalities modulated by sodium/hydrogen exchanger blockers (Urcelay et al., 2001). Overall, this genetic network was enriched with genes dominant in cell signaling pathways. Other strongly contributing genes are involved in lipid transport and tau formation (via actin/myosin binding). ZNF673 is associated with X-linked mental retardation (Lugtenberg et al., 2006; Ramaswamy et al., 2010) and is close (~0.2 Mb) to SLC9A7 on Xp11.3.
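The direction of a genotype-phenotype link such as G1-S1 can be read from the sign of the correlation between per-subject loading coefficients of the two components. The following is a minimal sketch only: it assumes a plain Pearson correlation, and the loading values are made-up illustrative numbers, not study data. The function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-subject loading coefficients (illustrative, not study data).
g1_loadings = [0.9, 0.4, 0.1, -0.3, -0.8]   # genetic component
s1_loadings = [-0.7, -0.5, 0.0, 0.2, 0.9]   # structural component

r = pearson_r(g1_loadings, s1_loadings)
direction = "negative" if r < 0 else "positive"
print(f"G1-S1 correlation is {direction} (r = {r:.2f})")
```

In this toy example, subjects with higher genetic loadings have lower structural loadings, which is the pattern a negative G1-S1 correlation describes.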
The second most significant genotype-phenotype association was between G2 and S1. This correlation was positive. G2 comprised 182 unique genes (377 SNPs). Some top-ranked genes from this component overlapped with those from G1, including ZNF673 and SLC9A7. These genes had a significant differential distribution among diagnostic groups, suggesting a role of actin localization and transcriptional regulation in LOAD. Other top genes from this network were involved in important AD-related processes, including the complement system, which is involved in amyloid-beta formation and inflammatory damage (van Es and van den Berg, 2009).
Network G4 correlated positively with S1. G4 contained several genes associated with risk for non-neurologic disorders, including diabetes and cardiovascular disease, both LOAD risk factors (Profenno et al., 2010). Top genes from this network, previously unidentified in the context of LOAD, belonged to the complement factor/inhibition pathway related to amyloid-beta clearance (35) or are associated with major histocompatibility class III. Additionally, AKAP9, a top 10 gene in this network, maintains neuronal Golgi integrity and is involved in LOAD pathogenesis (Stieber et al., 1996). Regarding association analysis, no top 10 gene from this network was significantly differentially distributed in the disease groups, suggesting that G4 comprises multiple SNPs of low effect acting together through diverse biological risk pathways, especially inflammation (see Eikelenboom et al., 2006), to significantly affect LOAD-related neuropathology.
The final genotype-phenotype association was a negative correlation between G3 and S1. G3 included ATP5G2, a subunit of mitochondrial ATP-synthase, which was over-represented in the disease group. Mitochondrial ATP-synthase in entorhinal cortex is a target of oxidative stress in LOAD (Terni et al., 2010) and part of LOAD apoptosis pathways. Other genes with dominant loadings in G3 included CNTN5, recently associated with multiple AD MRI characteristics (Biffi et al., 2010), CEP57, a microtubular/centrosomal localizer (Meunier et al., 2009), MTMR2, an endosomal regulator (Lee et al., 2010), and ATF7, previously associated with LOAD (Lin et al., 2006). The loading of previously identified LOAD genes and associated pathobiological pathways further supports the relevance of this genetic network.
Analyzing significant genes from all four components using DAVID and visualizing related processes on KEGG pathways revealed that genes grouped in multiple LOAD-relevant biological processes (see figure). Additional prominent processes not shown in the figure included cellular communication, cardiovascular diseases, signal transduction, calcium signaling, cell adhesion and neuronal developmental processes (e.g. axon guidance). Many such processes are implicated in LOAD pathology (e.g. neuronal calcium signaling (Kostiuk et al., 2010; LaFerla, 2002; Mattson and Chan, 2003)). Semaphorin 3A, an axon-guiding membrane protein, accumulates in hippocampus in AD (Koncina et al., 2007).
Major themes deriving from the top 32 Z-score-defined genes in the 4 SNP components suggest several major pathophysiological LOAD pathways, especially when such genes co-occurred within a component. From G1, APOE may relate to LOAD risk through pathways not directly linked to amyloid-beta, including actin-related mechanisms. Actin cytoskeletal changes as a path to tau formation (Gallo, 2007) are implicated across all components by SLC9A7/NHE7 (Kagami et al., 2008; Ohgaki et al., 2008), SHROOM2 and COBL (Dominguez, 2009) and microtubule-related genes including MTMR2, CEP57 and CTNND2 (Bamburg and Bloom, 2009; Meunier et al., 2009). Three such genes were present in G1. Immune function, especially the complement system, related to amyloid-beta clearance (Guerreiro et al., 2010; Kolev et al., 2009) and expressed in cerebrovascular smooth muscle (Walker et al., 2008), is suggested by ATF7, CFB, C2, SKIV2L, C6orf10 and C6orf15 (Li et al., 2006; Veerhuis, 2010). These genes support the known role of the complement system in LOAD pathogenesis (van Es and van den Berg, 2009), while adding new gene candidates, e.g. C2. Complement is present in dystrophic LOAD neurites, involved in immune response and linked to synaptic pruning (Hollingworth et al., 2010). Five immune-related/complement genes are present in G4.
CTNND2/Delta Catenin/NPRAP is associated with GSK3-beta, hence BAP and tau (Bareiss et al., 2010). CNTN5 encodes contactin 5; other contactins participate in LOAD risk (Biffi et al., 2010; Osterfield et al., 2008). The prominence of SLC9A7/NHE7 (and MTMR2) suggests the importance of the trans-Golgi network and recycling endosome (Lee et al., 2010). Endosomal processing of APP involving SorLA is of importance in LOAD (Lin et al., 2005; Marks and Berg, 2010; Ohgaki et al., 2008). VPS proteins are related to this process (He et al., 2005; Marks and Berg, 2010), although VPS13C has yet to be implicated. VPS13C is associated with maintenance of plasma glucose levels (Saxena et al., 2010); the related VPS26 is linked with BACE/memapsin2 (He et al., 2005). SLC44A4 is involved in choline uptake (Jurgensen and Ferreira, 2010). MTMR2 has relevance to excitatory synapses (Lee et al., 2010). ZKSCAN3/ZNF263 is associated with vascular endothelial growth factor (Yang et al., 2008).
The above data suggest involvement of multiple genes influencing varied, complex pathways that might interact mutually to contribute to LOAD. Output from Para-ICA lends itself readily to functional pathway analysis and ultimately systems biology. We also identified novel putative LOAD risk genes, confirmed by testing allelic frequency distributions among diagnostic groups in standard case-control association analyses. It is notable that while none of these genes survived a standard GWAS, they have high impact when their effect is evaluated in the context of other SNPs. In addition, the SNP components detected several genes previously unknown in the context of LOAD risk with high Z scores, exceeding those for APOE. Several of these (e.g. SLC9A7, ZNF673, VPS13C): (a) were identified by multiple (up to 17) SNPs, (b) mediate processes plausibly associated with LOAD risk based on pathway analyses and prior publications, (c) had SNPs differentially distributed among diagnostic groups and (d) are prominently expressed in brain. These results support the validity of these novel loci as candidate LOAD risk genes.
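Ranking genes by the Z scores of their SNP weights within a component, as referred to above, could be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the SNP names, weights, SNP-to-gene mapping, the `rank_genes` helper and the threshold of 1.5 are all hypothetical, and the thresholding actually used is not described in this section.

```python
from statistics import mean, pstdev


def zscores(weights):
    """Standardize a component's SNP weights (population standard deviation)."""
    mu = mean(weights.values())
    sd = pstdev(weights.values())
    if sd == 0:
        raise ValueError("weights are constant; Z scores are undefined")
    return {snp: (w - mu) / sd for snp, w in weights.items()}


def rank_genes(weights, snp_to_gene, threshold):
    """Return genes ordered by their largest |Z| among SNPs with |Z| >= threshold."""
    best = {}
    for snp, z in zscores(weights).items():
        if abs(z) >= threshold:
            gene = snp_to_gene[snp]
            best[gene] = max(best.get(gene, 0.0), abs(z))
    return sorted(best, key=best.get, reverse=True)


# Hypothetical SNP weights for one component (illustrative, not study data).
weights = {
    "snp_a": 0.02, "snp_b": -0.01, "snp_c": 0.03, "snp_d": 0.00,
    "snp_e": -0.02, "snp_f": 0.01, "snp_g": 0.90, "snp_h": -0.60,
}
# Hypothetical mapping from SNPs to placeholder gene names.
snp_to_gene = {snp: "GENE_BACKGROUND" for snp in weights}
snp_to_gene["snp_g"] = "GENE_X"
snp_to_gene["snp_h"] = "GENE_Y"

top_genes = rank_genes(weights, snp_to_gene, threshold=1.5)
print(top_genes)
```

Using the absolute Z score keeps strongly negative weights in the ranking, since a SNP can contribute to a component in either direction.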
Examining loading coefficients of the gene and structural networks revealed a stepped response pattern (see Fig S2), with MCI values falling between those of healthy controls and AD, except in G3, where they were elevated in MCI compared to AD. This suggests that this gene component may act to either protect against or hasten regional brain deterioration in MCI, influencing the rate of progression to AD.
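A stepped pattern of this kind can be checked by comparing group means of a component's loading coefficients. The sketch below uses hypothetical group labels and made-up values (not study data); `is_stepped` is an illustrative helper, not part of any Para-ICA software.

```python
from statistics import mean


def group_means(loadings_by_group):
    """Mean loading coefficient per diagnostic group."""
    return {group: mean(values) for group, values in loadings_by_group.items()}


def is_stepped(means, order=("HC", "MCI", "AD")):
    """True when the MCI mean lies strictly between the HC and AD means."""
    first, middle, last = (means[group] for group in order)
    return min(first, last) < middle < max(first, last)


# Hypothetical loadings (illustrative only): one stepped component and one
# where the MCI group exceeds the AD group.
stepped_component = {
    "HC": [0.2, 0.3, 0.1], "MCI": [0.5, 0.4, 0.6], "AD": [0.8, 0.9, 0.7],
}
mci_elevated_component = {
    "HC": [0.2, 0.3, 0.1], "MCI": [0.9, 0.8, 1.0], "AD": [0.5, 0.6, 0.4],
}

print(is_stepped(group_means(stepped_component)))
print(is_stepped(group_means(mci_elevated_component)))
```

The check works for either direction of the step, because it only asks whether the middle group's mean falls between the other two.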
Our study has limitations. Although the Para-ICA method is data driven, we restricted the genetic dataset to a disease-related subset. This focused analysis might fail to uncover genes affecting LOAD pathology via other interactive pathways that may not straightforwardly show group differences. However, since we employed a liberal statistical threshold to limit the genetic dataset to disease-related genes, we were able to include numerous SNPs discarded by conventional univariate studies. Our AD/MCI-focused gene set analysis may not have detected other genetic associations to brain structure. The analysis was carried out only in European-Americans, by far the most numerous ethnicity in the dataset. Future studies could include larger mixed populations. Also, since Para-ICA identified multivariate relationships at the gene network level (composed of linear combinations of SNPs), the directionality and effect magnitude of individual SNPs are not immediately transparent. Our supplementary association analysis to derive the top SNPs might be slightly biased, as these SNPs were already pre-selected at a liberal cut-off for inclusion in the multivariate analysis. Given these limitations and the novelty of our study, our results require further validation and replication in larger and more diverse independent datasets.
In conclusion, we met our major study goals. 1) We confirmed the feasibility of a hypothesis-blind, multivariate approach to corroborate LOAD genes associated with known pathologic mechanisms and to discover new putative disease-relevant genes that interact but fail individually to reach genome-wide significance. These data thus extend existing GWAS and hypothesis-driven analyses on the same ADNI data set (Biffi et al., 2010; Saykin et al., 2010; Shen et al., 2010). 2) The Para-ICA approach identifies genes in relatively modest-sized samples that are plausibly linked collectively in known physiologic pathways, perhaps epistatically, and suggests itself as a novel method for exploring other large-scale data sets involving gene and endophenotype information, such as BSNIP or COGS (Calkins et al., 2007; Thaker, 2008), in psychotic disorders, where the neuropathology and genetic basis are less well defined than in LOAD. Finally, 3) we identified plausible new biological pathways associated with AD neuropathology. Possible therapies resulting from our findings might include agents targeted to the complement and/or immune systems.