We have used a spatiotemporal AP-MS approach to obtain the first compendium of in vivo full-length Htt interacting proteins in the mammalian brain, with the identification of 747 candidate proteins that complex with fl-Htt in vivo, creating one of the largest in vivo proteomic interactome datasets to date and directly validating more than 100 previously identified ex vivo interactors shown to associate with small N-terminal Htt fragments. We have also provided information on the context (age or brain region) in which these proteins associate with fl-Htt. Moreover, we were able to rank the interacting proteins in an unbiased manner, based on the strength of their correlation with Htt, and to construct a WGCNA network that describes this interactome. Proteins in several WGCNA network modules are highly correlated with Htt itself, and appear to reflect distinct biological contexts in their interactions with Htt. Finally, we were able to validate 18 Red Module proteins as in vivo physical interactors or genetic modifiers in an HD fly model. In summary, our study provides the first comprehensive survey of proteins complexed with fl-Htt in distinct brain regions at different ages, and demonstrates a conceptually novel but simple approach to building unbiased in vivo interactomes based on AP-MS data from tissues as complex as the mammalian brain.
This study highlights core HD-relevant molecules and pathways via integration of our spatiotemporal fl-Htt interactome with previously generated Htt ex vivo interactome and cell- or invertebrate-based genetic modifier datasets, many of which are archived in the IPA ‘Huntington’s Disease Signaling’ pathway. We show that 139 previously identified ex vivo Htt interactors (e.g., from Y2H screens) also complex with fl-Htt in the mammalian brain (Goehler et al., 2004; Kaltenbach et al., 2007). Moreover, comparison of our interactome with datasets derived from genetic modifier screens in Drosophila and C. elegans models of HD may help identify proteins that can interact with mHtt in the mammalian brain and possibly modify its toxicity; such proteins could be prioritized for further validation in mammalian models of HD. For example, comparison of our dataset with data obtained from genetic screens in yeast, C. elegans, and fly models of HD (Nollen et al., 2004; Wang et al., 2009; Zhang et al., 2010; Silva et al., 2011) reveals several Red Module (CCTs, Hsp90s, 14-3-3s, and Vcp; ) and Pink Module (Uqcrc2) proteins in common, which could represent evolutionarily conserved modifiers of mHtt-induced toxicity. Therefore, their disease-modifying role should be fully explored in HD mammalian models.
A key motivation for this work was to obtain an unbiased global view of complex biological functions or disease processes related to fl-Htt protein in the intact brain, and to formulate novel, testable hypotheses. The first crucial insight obtained from our WGCNA analyses is that distinct Htt-correlated modules represent proteins preferentially complexed with Htt in specific sample conditions that reflect key biological processes: cortical Htt IP samples (Red, Blue, Yellow and Green Modules), cerebellar Htt IP samples (Pink Module), and 12-month but not 2-month cortical samples (Cyan Module). The second important insight is that each of the six significant modules captures critical aspects of known Htt and HD biology. All six are significantly enriched for ‘Huntington’s Disease Signaling’ in IPA (). Knowing the architecture and Htt-relevance of each network module can seed novel hypotheses based on important hubs and/or molecular or pathogenic processes defined by the module, e.g., Cntn1 and Vcp in the Red Module mediating mHtt toxicity; Rad23b in the Red Module implicating specific DNA repair and ubiquitin/proteasome pathways in Htt biology; Sirt2, Cox2 and Usp9x in the Cyan Module in age-dependent pathogenesis; and Itpr1 and Grid2 in cerebellar neuroprotection in HD. Testing such hypotheses will constitute a crucial next step not only towards unraveling the complex biology of Htt in healthy and diseased brains, but also towards further deciphering the biological significance of the in vivo Htt protein network.
One potentially exciting area of data integration would be to combine proteomic interactome data with large-scale genetic modifier and gene expression profiling studies from HD patients. The increasing capacity of DNA sequencing provides an unprecedented opportunity for such large-scale studies using patient samples, and our fl-Htt brain interactome may provide converging information on candidates that have both a genetic and a proteomic link to Htt. Thus, our study lends strong support to a systems biology strategy of vertically integrating large genetic, genomic, and interactome datasets (Geschwind and Konopka, 2009) derived from HD models of different organismal complexity to unravel the conserved mechanisms related to Htt biology and HD pathogenesis.
Our study supports the view that the majority of Htt interacting proteins are relatively stable across brain tissue and age (), while a portion of the Htt interactome is quite dynamic. The latter group of proteins, particularly those that consistently complex with mHtt in a brain region-specific or age-specific manner (), could be interesting candidates to study for their contribution to selective neuronal or regional vulnerability and age-dependent pathogenesis in HD. Our studies also raise the intriguing possibility that age-dependent changes in the normal brain proteome (e.g., Sirt2) may alter Htt interactions, which could in turn contribute to the presently unexplained role of aging in disease pathogenesis (Maxwell et al., 2011).
A major advance in this study is the use of a systems biology approach to construct in vivo protein interaction networks exclusively using proteomic interactome datasets generated from complex tissue, such as the mammalian brain. We applied, for the first time, WGCNA to analyze all the peptide count information for an entire group of Htt complexed proteins in our spatiotemporal AP-MS dataset. WGCNA provides an unbiased systems level organization of gene expression modules in both normal and diseased brains (Voineagu et al., 2011) and has been demonstrated to be among the most powerful methods for global network construction (Allen et al., 2012). Several lines of evidence support the validity and value of WGCNA analyses of our in vivo Htt interactome dataset. We were able to show that the pairwise correlation measure leads to a meaningful ranking of Htt-related proteins with respect to the external annotated knowledge of HD related proteins (Huntington’s Disease Signaling in IPA; ). WGCNA identified six significant Htt-correlated modules with distinct tissue- or age-specific over-representation, and significant enrichment of distinct biological functions previously implicated in Htt biology (), effectively providing an in silico dissection of the molecular processes related to fl-Htt biology.
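For readers unfamiliar with WGCNA, the standard soft-thresholding step that turns pairwise correlations into a weighted network can be written as below. This is a generic sketch of the unsigned variant, not a statement of the settings used in this study: the symbols x_i, s_ij, a_ij, k_i and the power β are notation introduced here for illustration, and neither the network type nor the value of β is given in this section.

```latex
% x_i: vector of peptide counts for protein i across all AP-MS samples (notation assumed here)
% s_{ij}: co-variation similarity; a_{ij}: weighted adjacency; k_i: connectivity of protein i
s_{ij} = \left| \operatorname{cor}(x_i, x_j) \right|, \qquad
a_{ij} = s_{ij}^{\beta} \quad (\beta \ge 1), \qquad
k_i = \sum_{j \ne i} a_{ij}
```

Raising the correlation to a power β keeps all pairwise relationships in the network while down-weighting weak ones, so no hard cutoff is needed to separate interactors from background.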
Several experimental factors were instrumental to the construction of WGCNA networks based on our AP-MS dataset. First, the relative level of bait protein (fl-Htt) brought down by IP is markedly, but reproducibly, variable across all samples. Such variation is due to our experimental design of using samples from distinct brain regions, ages, genotypes, and no-antibody controls, and also to the use of an anti-Htt antibody that preferentially binds to human over murine Htt. The quantitative difference in the amount of Htt precipitated in each sample results in a similar quantitative variation for those proteins that were tightly associated with Htt (i.e., highly correlated with Htt), while background proteins (false positives) in the sample are less likely to vary in a similar manner as Htt. Hence, rather than being weakened by this experimental variance, WGCNA exploited it to extract the quantitative correlation relationships among the proteins identified in our study. The second important factor for WGCNA analyses was the large-scale and multi-dimensional nature (e.g., brain region, age, and genotype) of our study. We estimated that one would need at least 24 independent AP-MS experiments (at least one biological replicate per sample condition), with systematic changes in the sample conditions to create differential pull-down of the bait protein and its complexes, in order to construct a robust WGCNA protein interaction network. One caveat of the current study is our use of MS unique tryptic peptide counts as a semi-quantitative readout of relative protein abundance. This limitation could be addressed by using stable isotope labeling in intact animals for a quantitative AP-MS study (Krüger et al., 2008).
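The reasoning above, that true interactors co-vary with the bait across samples while background proteins do not, can be illustrated with a minimal Python sketch. Everything in it is assumed for illustration: the protein names, the peptide counts, and the choice of a signed Pearson correlation as the ranking score are hypothetical and are not taken from the dataset or the analysis pipeline of this study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return cov / (sd_x * sd_y)


def rank_by_bait_correlation(counts, bait):
    """Rank every non-bait protein by its correlation with the bait profile."""
    bait_profile = counts[bait]
    scores = [
        (name, pearson(profile, bait_profile))
        for name, profile in counts.items()
        if name != bait
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical peptide counts across six IP samples (illustrative values only).
# The bait recovery differs strongly between samples, as described in the text.
toy_counts = {
    "bait": [2, 10, 4, 12, 0, 8],
    "tight_partner": [1, 6, 2, 7, 0, 5],  # tracks the bait
    "background": [3, 3, 4, 2, 3, 4],     # roughly constant, unrelated to the bait
}

ranking = rank_by_bait_correlation(toy_counts, "bait")
for name, r in ranking:
    print(f"{name}\t{r:.2f}")
```

In this toy example the protein whose counts rise and fall with the bait ranks first, whereas the protein with a flat profile ranks last. The wider the spread in bait recovery across samples, the more clearly the two classes separate.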
Finally, our analysis provides a central molecular network, the Red Module, which is likely to contain proteins crucial to Htt biology and may constitute novel molecular targets to study for HD pathogenesis and therapeutics. The Red Module includes Htt as a member, and is highly enriched with Htt interactors and genetic modifiers (). We were able to validate 6 Red Module proteins as novel in vivo Htt interactors by co-IP () and 12 as novel modifiers of Htt-induced neuronal dysfunction in a fly model (; Supplemental Figure S4A–J). Moreover, Red Module proteins are targets for small molecules that are in HD clinical trials (i.e., creatine targeting Ckb; Hersch et al., 2006), or show effectiveness in preclinical studies in HD or other polyglutamine disorders in mice (Waza et al., 2005; Masuda et al., 2008). Considering that several other proteins in this module can also be targeted by small molecules (), it would be interesting to explore whether pharmacological targeting of these proteins could be therapeutic in HD preclinical models.
In conclusion, we have constructed the first compendium of in vivo fl-Htt interacting proteins in distinct brain regions and ages, thereby providing a valuable resource for further exploration of the normal function of Htt in several disease-relevant biological contexts, for identification of novel molecular targets critical to selective pathogenesis in HD, and for the development of new therapies. Moreover, we have demonstrated an innovative approach utilizing WGCNA to analyze multidimensional AP-MS datasets to produce in vivo protein networks for a given bait protein, and validated key proteins in the network using an HD fly model. This powerful application of systems biology to proteomics can be readily applied to decipher in vivo protein networks for other complex biological systems or disease processes in tissues as complex as the mammalian brain.