|Home | About | Journals | Submit | Contact Us | Français|
Combined analyses of gene networks and DNA sequence variation can provide new insights into the aetiology of common diseases. Here, we used integrated genome-wide approaches across seven rat tissues to identify gene networks and the loci underlying their regulation. We defined an interferon regulatory factor 7 (IRF7)1-driven inflammatory network (iDIN) enriched for viral response genes, which represents a molecular biomarker for macrophages and was regulated in multiple tissues by a locus on rat chromosome 15q25. At this locus, Epstein-Barr virus induced gene 2 (Ebi2 or Gpr183), which we localised to macrophages and is known to control B lymphocyte migration2,3, regulated the iDIN. The human chromosome 13q32 locus, orthologous to rat 15q25, controlled the human equivalent of iDIN, which was conserved in monocytes. For the macrophage-associated autoimmune disease type 1 diabetes (T1D) iDIN genes were more likely to associate with T1D susceptibility than randomly selected immune response genes (P = 8.85 × 10−6). The human locus controlling the iDIN, was associated with the risk of T1D at SNP rs9585056 (P = 7.0 × 10−10, odds ratio = 1.15), which was one of five SNPs in this region associated with EBI2 expression. These data implicate IRF7 network genes and their regulatory locus in the pathogenesis of T1D.
While genome-wide association studies (GWAS) have uncovered many common genetic variants associated with human diseases, the molecular mechanisms by which DNA variation affects disease risk remain poorly characterised4. To translate genetic association into biological function DNA variation has been correlated with gene expression to identify the genetic drivers of gene networks, which are coordinately regulated by transcription factors (TFs), that represent important determinants of disease aetiology5,6,7. Here we used a panel of recombinant inbred (RI) rat strains8 to study TF-driven gene networks and their regulatory loci and integrated these data with human gene expression and GWAS data to identify genes, networks and pathways for human disease (Supplementary Fig. 1).
We combined expression quantitative trait loci (eQTLs) from fat, kidney and heart8,9 with new eQTL data in aorta, skeletal muscle, adrenal and liver to create genome-wide eQTL datasets across seven rat tissues. We used a two-step procedure to integrate eQTLs and TF-target genes to identify TF-driven gene networks (Supplementary Information). In the first step, we indentified 147 TFs whose expression mapped to 587 eQTLs across seven tissues, which were mostly (>90%) under trans-regulatory genetic control, in keeping with previous studies in yeast10,11. In the second step, we tested for enrichment of transcription factor binding sites (TFBSs)12 in the putative promoter sequences of genes whose expression mapped to trans-eQTLs. Out of the 13 TF-driven gene networks identified (Supplementary Table 1) we observed the strongest TFBS enrichment for interferon regulatory transcription factor Irf7 (P < 1 × 10−6, False Discovery Rate, FDR < 5 × 10−5). Irf7 TFBSs were predicted in the promoters of 23 genes, including Irf7 itself, that all mapped to a single trans-eQTL on rat chromosome 15q25 in adrenal, kidney, heart and liver. We confirmed experimentally a subset of the predicted Irf7-targets by chromatin immunoprecipitation and quantitative PCR that established direct interaction of Irf7 with the promoters of these genes (Fig. 1a-c). Taken together, this provides evidence for a TF-driven regulatory cascade in which genetic variation on chromosome 15q25 modulates the expression of Irf7 and Irf7-target genes.
Irf7 is a master regulator of the type 1 interferon response1 and genes directly regulated by Irf7 may comprise the core components of a larger network, which we identified by genome-wide co-expression analysis of Irf7-target genes across tissues (Supplementary Information). This revealed a network of 247 genes across seven tissues, which was expanded to 305 genes in four of the seven tissues where additional gene expression data were available (FDR < 0.1%) (Supplementary Table 2). Gene ontology (GO) analysis of the network showed enrichment for specific biological processes, including “immune response” (P = 7.5 × 10−29), and “response to virus” (P = 9.6 × 10−7) (Supplementary Table 3). We designated the network the Irf7-driven inflammatory gene network (iDIN) (Fig. 1d), which was most enriched for expression in mouse bone marrow macrophages (P = 1.6 × 10−159) and human monocytes (P = 6.0 × 10−177) with high levels of expression in other immune cells, including B lymphocytes (Supplementary Fig. 2).
Whilst a core of 23 Irf7-target genes mapped to the same trans-eQTL on rat chromosome 15 the overall genetic control of the iDIN was unknown. We used sparse Bayesian regression models13 to determine the association between expression levels of iDIN genes across seven tissues with genome-wide single nucleotide polymorphisms (SNPs) and identify regulatory ‘hot-spots’14. The same rat 15q25 locus, which controlled Irf7 and its targets in trans, was associated with iDIN expression in all tissues (FDR < 1%) and showed the strongest evidence for common regulation in five out of seven tissues with increased iDIN genes expression associated with the Spontaneously Hypertensive Rat (SHR) allele (Fig. 2). The iDIN, which is highly expressed in immune cells, may represent a molecular signature of macrophages that are associated with risk of common inflammatory diseases15 and autoimmune disease type 1 diabetes (T1D)16. Hence, we characterized expression of Cd68, an established marker of macrophages17, in SHR and Brown Norway (BN) hearts and the RI strains. Cd68 mRNA levels were elevated in SHR as compared to BN heart (P = 0.01), which reflected increased numbers of macrophages (P = 2 × 10−22). In the RI strains, Cd68 was under trans-acting genetic control at the 15q25 locus that regulates the iDIN (Supplementary Fig. 3).
We then analyzed genetic variation in the RI strains using SNPs18 from the 15q25 region, which contains seven annotated protein-coding genes, and determined the expression of iDIN genes in seven inbred rat strains of known genotype that refined the locus to a 700kb region (Supplementary Fig. 4). Using the SHR genome sequence19, only Dock9, Ebi2 and Tm9sf2 exhibited DNA variation within the region, which was synonymous for Dock9, non-synonymous but not predicted to be functional for Tm9sf2 and a 5′UTR SNP for Ebi2 (Supplementary Table 4). Ebi2 was the only differentially expressed gene between parental strains within the region, was cis-regulated in heart and kidney and highly expressed in myeloid cell types (Supplementary Fig. 4 and 5). We assessed the effect of the Ebi2 5′UTR SNP by luciferase assay, the SHR allele resulted in reduced luciferase activity as compared to the BN allele (Supplementary Fig. 5).
Ebi2 (or Gpr183) encodes an orphan G protein-coupled receptor (GPCR) that controls B cell migration2,3 and is a candidate for the regulation of the iDIN at the chromosome 15q25 region. We localised Ebi2 expression to Cd68+ve macrophages within the rat heart (Supplementary Fig. 6), an observation that we confirmed and extended across tissues (pancreas, liver, kidney and heart) in the Ebi2GFP/+ mouse2 (Supplementary Fig. 7). SiRNA knockdown of Ebi2 in primary cultures of rat macrophages (Supplementary Fig. 8a) increased expression of Irf7, the central hub of the iDIN, and of iDIN genes (Supplementary Fig. 8b). This suggests that Ebi2 is a negative regulator of the innate immune response in macrophages, which would be consistent with lower Ebi2 expression in the SHR that has more macrophages than the BN rat (Supplementary Fig. 3).
To translate our findings to humans, we tested whether the iDIN was recapitulated in human immune cells using genome-wide expression data from monocytes isolated from 1,490 individuals from the Gutenberg Heart Study (GHS)20. We performed TFBS enrichment and co-expression analysis, analogous to that performed in the rat, and identified the human IRF7-driven network (Supplementary Table 5), which exhibited strong overlap with the rat iDIN (P = 9.1 × 10−20) and was most strongly annotated by the GO term “response to virus” (P = 1.9 × 10−13) (Supplementary Table 6). Using monocyte gene expression data from a distinct cohort of 758 subjects from the Cardiogenics Study (Supplementary Information), we found the same set of co-regulated IRF7-target genes (Supplementary Table 5) and significant overlap with the expanded IRF7-driven network indentified in the GHS (P = 8.3 × 10−23).
We determined whether the human chromosome 13q32 locus (spanning ~1 Mb, Supplementary Table 7), which is orthologous to the critical rat chromosome 15q25 region, was associated with expression of iDIN genes in humans. Multivariate analysis of the Cardiogenics monocyte expression and genotype data revealed that six SNPs in the 13q32 region (including rs9557217, P = 5.0 × 10−5; and rs9585056, P = 1.1 × 10−3) were associated with trans-regulated expression of IRF7 and IRF7-target genes (Supplementary Fig. 9). We did not, however, detect a signal for trans-regulation of IRF7 or IRF7-target genes at the 13q23 locus in the GHS cohort. This may reflect the differences between the monocyte selection protocols used in the two studies (Supplementary Information and data not shown).
In both the GHS and Cardiogenics cohorts, EBI2 expression in monocytes was cis-regulated at the 13q32 locus but the peak SNPs differed between the two cohorts (most associated SNPs: Cardiogenics, rs9585056, P = 2.2 × 10−8; GHS, rs9517725, P = 6.8 × 10−13) (Fig. 3). However, a formal hypothesis test21 of a common causal genetic variant was not rejected (P = 0.14). Two of the five SNPs, rs9557217 and rs9585056 contained in the model explaining EBI2 expression also exhibited a significant trans-effect on iDIN gene expression in the Cardiogenics cohort (Supplementary Fig. 9), suggesting common regulatory control by this locus on the IRF7 network and EBI2 expression.
Monocyte-derived macrophages are critical determinants of inflammatory processes important for common disease15 including autoimmune T1D22. The iDIN is expressed in macrophages, enriched for immune response genes and contained IFIH1, a well-characterised T1D susceptibility gene23,24. We evaluated the association of the human orthologues of rat iDIN genes and genes in the human iDIN (Fig. 3) with T1D (Supplementary Information). SNPs close to (<=1Mb) any iDIN genes were significantly more likely to associate with T1D in large-scale GWAS than SNPs close to genes not in the network (P = 2.4 × 10−10) (Supplementary Table 8). We also tested the iDIN association with T1D against all genes annotated by the GO term “immune response” and established an overrepresentation of T1D associated genes (P = 8.85 × 10−6), indicating that the iDIN more specifically categorizes T1D genes than the GO term “immune response”.
In a GWAS meta-analysis of T1D in 7,514 cases and 9,045 controls25, we found evidence for association of the chromosome 13q32 region at SNP rs9585056 (P = 1.3 × 10−7) that had not been reported before (Fig. 3b). We genotyped this SNP in two independent large cohorts and increased the strength of the T1D association (combined P = 7.0 × 10−10, odds ratio (95% confidence interval) = 1.15 (1.09-1.21), Supplementary Table 9). The minor C allele of SNP rs9585056 was associated with T1D risk, lower EBI2 expression in both GHS and Cardiogenics cohorts and, on average, increased expression levels of iDIN genes in the Cardiogenics cohort. Although we cannot discriminate between single and multiple causal variants, overall, these results show an overlap of association signals in the same region on human chromosome 13q32 for iDIN genes, EBI2 cis-regulation and T1D. We also noted that the EBI1 (or CCR7) and EBI3 (or IL27B) genes are also associated with T1D susceptibility: EBI1 is in the confirmed T1D region 17q21.225, and EBI3 encodes the beta subunit of the IL-27 cytokine, for which alpha-subunit gene, IL27, is in the T1D region 16p11.225, suggesting a link between EBV infection and T1D.
The immunopathology of autoimmune T1D is characterised by infiltration of the pancreas with B and T lymphocytes and macrophages16. We have shown that iDIN genes contribute to T1D risk and implicate the innate viral response pathway and macrophages in the aetiology of T1D. Genetic control points that perturb biological networks can represent important loci for disease risk6 and the new T1D susceptibility locus that we identified may regulate innate immune response genes in macrophages, as we demonstrated in the rat. Ebi2, which controls Irf71, represents a candidate for trans-regulation of the human iDIN and for T1D risk. A role for IRF7 in the pathogenesis of T1D is supported by functional studies26 and by other T1D genes, namely TLR7, TLR827, and IFIH123,24, which are regulated by or act through IRF728. Our study shows that co-expression networks across species provide functional annotation of genes in biological processes that can be used to reveal the signal of common genetic variation of small effect that is not detected by GWAS.
Genome-wide expression data in the rat were generated from seven tissues: adrenal, aorta, fat, kidney, left ventricle, liver and skeletal muscle using Affymetrix RAE 230a and RAE 230_2 chips. eQTL mapping was carried out using the genetic map of the BXH/HXB RI strains generated in a previous large scale effort by the STAR consortium18, as previously described8,9. In humans, expression data from isolated monocytes were obtained from 1,490 population-based individuals from the Gutenberg-Heart Study20 and from 758 individuals from the Cardiogenics Study. eQTL data were analyzed in conjunction with TFBS enrichment analysis using PASTAA12 to identify core gene networks centred on transcription factors. The core networks were expanded to include genes showing co-expression (FDR < 0.1%) with any of the core network genes in seven rat tissues and isolated human monocytes. Association between expression levels of the network genes and genome-wide SNPs in the rat was carried out using sparse Bayesian regression models13, and the major regulatory control points (‘hot-spots’)14 for the entire network were identified. Genes at the locus associated with the rat network were characterised by DNA sequencing, RNA-Sequencing, quantitative PCR analyses, luciferase assay and combined in situ hybridisation and immunohistochemistry. A combined network, comprised of the union or intersection of the rat and human networks, was constructed and analyzed for association with T1D by means of a stratified Wilcoxon rank test to compare SNPs genotyped in T1D GWAS25,29 close (<=1Mb) to any network gene or to those close to any gene not in the network (see www.t1dbase.org for all T1D SNP association data). SNPs across the human locus, that is orthologous to rat chromosome 15q25 controlling the network, were tested for association with T1D as described elsewhere25. Supplementary Fig. 1 provides an overview of the study design. Full Methods are provided in Supplementary Information.
We acknowledge funding from the German National Genome Research Network (NGFN-Plus ‘Genetics of Heart Failure’), the Helmholtz Association Alliance on Systems Biology (MSBN), EURATools (LSHG-CT-2005-019015), European Union FP6 (LSHM-CT-2006-037593), PHC ALLIANCE 2009 (19419PH), UK National Institute for Health Research Biomedical Research Unit (Royal Brompton and Harefield NHS Trusts, University Hospitals of Leicester NHS Trusts) and Biomedical Research Centre (Imperial College NHS Trust) awards, the British Heart Foundation, grant P301/10/0290 from the Grant Agency of the Czech Republic and 1M6837805002 from the Ministry of Education of the Czech Republic, the Fondation Leducq the Medical Research Council UK, Research Councils UK, the Juvenile Diabetes Research Foundation International, the Wellcome Trust. The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement N° HEALTH-F4-2010-241504 (EURATRANS).
Author Contributions S.A.C., N.H. and E.P. initiated the study. M.H., E.P., N.H. and S.A.C. participated in the conception, design and coordination of the study. H.L., Y.L., R.S., Y.A.L., S.P., C.R., K.S. and R.B. performed genetic, biochemical and functional analysis in the rat. E.E.G. and J.G.C. provided Ebi2GFP/+ mouse data. M.P. and T.J.A. contributed materials and discussion of the manuscript. M.H., E.P., C.W., D.J.S., D.C., A.B., S.R.L., L.B., M.R. and L.T. designed and applied the modelling methodology and statistical analyses. M.H., E.P. and H.S. performed expression QTL analysis in the rat. L.B. conceived and performed the Bayesian analysis. C.W., D.J.S. and D.C. performed association analyses in humans. M.H., O.H., H.R. and M.V. conceived and performed bioinformatics analyses in the rat. J.E., C.H., S.M., W.H.O., C.M.R., N.J.S., H.S., A.H.G., S.B., T.M., T.Z., S.S., A.Z., M.R., L.T. and F.C. provided the human monocyte expression data and contributed to the transcriptomic analyses in the Cardiogenics and the Gutenberg Heart Study cohorts. M.H., E.P., N.H. and S.A.C. wrote the paper with significant contributions from C.W. and J.A.T. All authors discussed the results and commented on the manuscript. M.H. and E.P. contributed equally to the paper and are listed in alphabetical order.