Genome-wide association studies (GWAS) and candidate gene studies in ulcerative colitis (UC) have identified 18 susceptibility loci. We conducted a meta-analysis of 6 UC GWAS, comprising 6,687 cases and 19,718 controls, and followed up the top association signals in 9,628 cases and 12,917 controls. We identified 29 additional risk loci (P<5×10⁻⁸), increasing the number of UC-associated loci to 47. After annotating associated regions using GRAIL, eQTL data and correlations with non-synonymous SNPs, we identified many candidate genes providing potentially important insights into disease pathogenesis, including IL1R2, IL8RA/B, IL7R, IL12B, DAP, PRDM1, JAK2, IRF5, GNA12 and LSP1. The total number of confirmed inflammatory bowel disease (IBD) risk loci is now 99, including a minimum of 28 shared association signals between Crohn’s disease (CD) and UC.
UC and CD represent the two major forms of inflammatory bowel disease (IBD: OMIM #266600), which together affect approximately 1 in 250 people in Europe, North America and Australasia. Clinical features, epidemiological data and genetic evidence suggest that UC and CD are related polygenic diseases. In contrast to CD, bowel inflammation in UC is limited to the colonic mucosa. While disease-related mortality is low, morbidity remains high and 10-20% of affected individuals will undergo colectomy. Though the precise etiology is unknown, the current hypothesis is a dysregulated mucosal immune response to commensal gut flora in genetically susceptible individuals¹. Recent genome-wide and candidate-gene association studies have identified 18 UC susceptibility loci, including 7 that overlap with CD (e.g. IL23 pathway genes, NKX2-3 and IL10). Known UC-specific loci (HNF4A, CDH1 and LAMB1) have highlighted the role of defective barrier function in disease pathogenesis².
The 18 confirmed UC loci explain approximately 11% of UC heritability (see Online Methods). To identify additional UC susceptibility loci and further elucidate disease pathogenesis, we combined data from six GWAS using genotype imputation and meta-analysis methodology (see Online Methods). The discovery panel consisted of 6,687 cases and 19,718 controls of European descent with data available for at least 1.1 million SNPs (Supplementary Table 1). A quantile-quantile plot of the meta-analysis test statistics showed a marked excess of significant associations in the tail of the distribution (Supplementary Figure 1). The majority (16/18) of previously confirmed UC loci reached genome-wide significance (P<5×10⁻⁸). Two just failed to meet this threshold in the meta-analysis, 4q27³ and 22q13⁴ (Table 1), but we still consider these to be true risk loci given the strength of association in the initial studies (P=1.35×10⁻¹⁰ and P=4.21×10⁻⁸, respectively). Fifty loci with P<1×10⁻⁵ and not previously associated with UC were followed up by genotyping the most associated SNP from each locus in an independent panel of 9,628 UC cases and 12,917 population controls (see Online Methods and Supplementary Table 2). Of these, 28 loci had evidence of association (P<0.05) in the follow-up panel and attained genome-wide significance (P<5×10⁻⁸) in the combined analysis of meta-analysis and follow-up cohorts (Table 2 and Supplementary Table 3). In addition, although the locus on 1q32 failed follow-up genotyping (rs7554511), it had previously been tested for association to UC in an independent cohort (rs11584383: P=1.2×10⁻⁵)⁵. This alternative tag SNP achieves genome-wide significance in our current meta-analysis (P=3.7×10⁻¹¹) and therefore we consider this a confirmed UC locus, bringing the total number of new UC loci to 29.
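To illustrate how evidence from a discovery meta-analysis and a follow-up panel can be combined into a single test, the sketch below uses a standard inverse-variance weighted fixed-effects model. The methodology actually used is described in the Online Methods and may differ. The function name `fixed_effects_meta` and the effect sizes and standard errors are hypothetical, chosen only for illustration.

```python
import math

GENOME_WIDE_P = 5e-8  # genome-wide significance threshold used in the text


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects combination of per-study effects.

    betas: per-study log odds ratios for the same SNP and the same allele.
    ses:   matching standard errors.
    Returns (combined_beta, combined_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


# Hypothetical inputs (not study results): discovery estimate, then follow-up estimate.
beta, se, p = fixed_effects_meta(betas=[0.10, 0.09], ses=[0.022, 0.020])
print(f"combined OR = {math.exp(beta):.3f}, P = {p:.2e}, "
      f"genome-wide significant: {p < GENOME_WIDE_P}")
```

Under this model a signal that is only suggestive in each panel separately can cross the genome-wide threshold once the panels are combined, which is the logic behind requiring both nominal replication and combined significance.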
It should be noted that 12 of the 29 loci had documented nominal evidence of association (5×10⁻⁸<P<0.05) to UC in previous reports (1p36², 1q32⁶, 5q33⁶, 6p21⁵, 7q32⁷, 9p24⁵,⁸, 9q34⁵,⁹, 10p11⁶, 11q23⁵, 13q12⁸, 13q13² and 20q13¹⁰). We also tested the 28 loci with follow-up genotype data for association with two clinically relevant disease sub-phenotypes (maximum disease extent and need for colectomy for medically refractory disease) but no significant associations were seen following correction for multiple testing (P<5.2×10⁻⁴) (Supplementary Table 4). In summary, there are 47 confirmed UC susceptibility loci, 18 from previous studies and 29 from the current study.
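The text does not state how the multiple-testing threshold was derived. As a rough check, the sketch below assumes a Bonferroni correction at a family-wise α of 0.05 (both are assumptions) and works backwards from the quoted threshold to the number of tests it would imply. The function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def implied_number_of_tests(alpha: float, threshold: float) -> int:
    """Number of tests that would yield `threshold` under Bonferroni (rounded)."""
    return round(alpha / threshold)


# Assumed alpha of 0.05; 5.2e-4 is the threshold quoted in the text.
n_tests = implied_number_of_tests(0.05, 5.2e-4)
print(n_tests, f"{bonferroni_threshold(0.05, n_tests):.2e}")
```

Under these assumptions the quoted threshold corresponds to roughly 96 tests; how those tests were counted is not specified in this section.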
As a first step towards obtaining biological insight from the identification of these 47 loci, we examined the gene content of the associated regions (Supplementary Figure 2). Although three regions contained a single gene (5p15:DAP, 6q21:PRDM1, 10q24:NKX2-3), most (35/47) contain multiple genes and nine are not believed to contain any gene (Table 1). We attempted to identify plausible candidate genes by (a) using a literature-mining tool (GRAIL) to identify non-random, evidence-based links between genes, (b) searching an existing eQTL database¹¹ for correlations with our most associated SNPs (Supplementary Table 5), (c) using 1000 Genomes data to identify non-synonymous SNPs in linkage disequilibrium (LD) (r²>0.5) with the most associated SNP in the locus (Supplementary Table 6), and (d) determining the gene in closest physical proximity to the most associated SNP (see Online Methods). These approaches (results summarized in Table 1, Table 2 and Supplementary Table 7) consistently identified a single candidate gene in six of the associated regions (2q11:IL1R2, 5p13:IL7R, 7p22:GNA12, 10p11:CCNY, 1p31:IL23R, 16q22:ZFP90), potentially prioritizing which genes to follow up in future genetic and functional studies. Noteworthy candidate genes are described in Box 1. Follow-up genotyping in even larger independent panels of cases and controls from a range of ethnicities may be needed to identify the genes containing causal variants.
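The prioritization step described above can be sketched as a simple agreement check across the four annotation approaches. This is a minimal illustration, not the authors' pipeline: requiring all four approaches to nominate the same gene is an assumption about what "consistently identified" means, and the locus names, gene symbols and dictionary keys below are placeholders.

```python
APPROACHES = ("grail", "eqtl", "nonsynonymous_ld", "nearest_gene")


def consensus_gene(calls):
    """Return the gene named by every approach, or None if they disagree.

    calls maps each approach name to the gene symbol it nominated
    (None, or a missing key, when the approach produced no candidate).
    """
    genes = {calls.get(approach) for approach in APPROACHES}
    if len(genes) == 1 and None not in genes:
        return next(iter(genes))
    return None


def prioritize(loci):
    """Map locus -> consensus gene for loci where all approaches agree."""
    result = {}
    for locus, calls in loci.items():
        gene = consensus_gene(calls)
        if gene is not None:
            result[locus] = gene
    return result


# Placeholder loci and gene symbols, not results from the study.
example = {
    "locus_A": {"grail": "GENE1", "eqtl": "GENE1",
                "nonsynonymous_ld": "GENE1", "nearest_gene": "GENE1"},
    "locus_B": {"grail": "GENE2", "eqtl": "GENE3",
                "nonsynonymous_ld": None, "nearest_gene": "GENE2"},
}
print(prioritize(example))
```

A looser rule, such as agreement among only those approaches that returned a candidate, would retain more loci at the cost of weaker support per gene.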
TNFRSF14 / MMEL1 (1p36): TNFRSF14 encodes a member of the TNF receptor superfamily. In a T cell transfer model of colitis, TNFRSF14 expression by innate immune cells has an important role in preventing intestinal inflammation²². MMEL1 encodes membrane metalloendopeptidase-like 1. This locus is associated with susceptibility to celiac disease and primary biliary cirrhosis; a nsSNP in MMEL1 was nominally associated with multiple sclerosis.
TNFRSF9 (1p36): Tumour necrosis factor receptor superfamily member 9 is involved as a co-stimulator in the regulation of peripheral T cell activation, with enhanced proliferation and IL2 secretion. It is expressed by dendritic cells, granulocytes and endothelial cells at sites of inflammation. SCID mice transferred with naïve CD4+ T cells from TNFRSF9-deficient mice develop colitis of the same intensity as SCID mice transferred with wild-type naïve T cells, but with a modified cytokine response²³.
IL1R2 (2q11): Interleukin 1 receptor, type II binds IL1a, IL1b and IL1R1, inhibiting the activity of these ligands. Two alternative splice transcripts of IL1R2 have been reported. This protein serves to antagonise the action of IL1a and IL1b, pleiotropic cytokines with various roles in inflammatory processes. IL1b production by lamina propria macrophages is increased in patients with UC²⁴.
This locus is immediately adjacent to a CD-associated locus containing IL18RAP, IL1R1 and other genes. It is unclear at present whether the CD-associated and UC-associated SNPs in these regions tag two separate loci or one locus. The lead CD SNP has P=0.001 in our UC meta-analysis. There is a large recombination hotspot between IL1R2 (UC) and IL1R1 (CD).
IL8RA / IL8RB (2q35): IL8RA and IL8RB encode two receptors for interleukin-8, a powerful neutrophil chemotactic factor. IL8RA expression, limited to a subpopulation of lamina propria macrophages and germinal centre lymphocytes in the healthy colon, is increased in macrophages, lymphocytes and epithelium in UC²⁵. IL8RB expression is more limited and not upregulated in UC. IL8 expression is profoundly increased in colonic tissue from UC patients compared with controls; this increase is driven by inflammation²⁶.
DAP (5p15) encodes death-associated protein. The DAPs are a heterogeneous group of polypeptides isolated in a screen for elements involved in the IFNγ-induced apoptosis of HeLa cells. DAP negatively regulates autophagy and is a substrate of mTOR¹³.
IL7R (5p13) encodes the receptor for interleukin-7. IL7 is a key regulator of naïve and memory T cell survival, specifically the transition from effector to memory T cells²⁷. T cells expressing high levels of IL7R are seen in human and murine colitis; selective depletion of these cells ameliorates established colitis²⁸. IL7R is a confirmed multiple sclerosis susceptibility gene²⁹. The gene may have undergone extensive evolutionary selective pressure by intestinal helminths³⁰.
PRDM1 (6q21) encodes PR domain containing 1, with ZNF domain (synonym BLIMP1), the master transcriptional regulator of plasma cells and a transcriptional repressor of the IFN-β promoter. It plays important roles in the proliferation, survival and differentiation of B and T lymphocytes.
GNA12 (7p22) encodes guanine nucleotide binding protein (G protein) alpha 12, a membrane-bound GTPase that plays an important role in tight junction assembly in epithelial cells, through interactions with ZO-1 and Src²⁰.
IRF5 (7q32), encoding interferon regulatory factor 5, is a confirmed susceptibility gene for rheumatoid arthritis, SLE and primary biliary cirrhosis. This transcription factor regulates the activity of type I interferons and induces cytokines including IL-6, IL-12 and TNFα via TLR signaling. In response to Mycobacterium tuberculosis infection of macrophages, type I interferon expression is dependent on a pathway including IRF5, NOD2 and RIP2³¹.
LSP1 (11p15): Lymphocyte-specific protein-1 is expressed by lymphocytes and macrophages, and also in endothelium, where it is critical for normal neutrophil transmigration³².
Additional bioinformatic analyses were also performed on the entire set of genes in the associated regions to search for functional commonalities across this large number of loci (see Online Methods). Specifically, using a gene set enrichment approach the UC loci are seen to have more genes associated with cytokines and cytokine receptors (including IFNγ, several interleukins, five TNF and TNFR superfamily members), key regulators of cytokine-mediated signaling pathways, innate and adaptive immune response, macrophage activation and regulation of apoptosis than would be expected by chance (Supplementary Table 8 and Supplementary Figure 3). Enrichment analysis of the subset of candidate loci with no known association to other inflammatory diseases showed significant over-representation of gene sets associated with MAP kinase signaling, actin binding, calcium-dependent processes, fatty acid and lipid metabolism (Supplementary Table 8 and Supplementary Figure 3).
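To make the idea of "more genes than would be expected by chance" concrete, the sketch below computes a one-sided hypergeometric enrichment p-value. This is a common stand-in for gene set over-representation testing and is not necessarily the method used here (see Online Methods). The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_set, n_selected, n_overlap):
    """One-sided hypergeometric p-value for gene set over-representation.

    n_background: genes in the background (e.g. all annotated genes).
    n_in_set:     background genes belonging to the gene set of interest.
    n_selected:   genes located in the associated regions.
    n_overlap:    selected genes that belong to the gene set.
    Returns P(X >= n_overlap) when n_selected genes are drawn at random
    without replacement from the background.
    """
    upper = min(n_in_set, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p = hypergeom_enrichment_p(n_background=20000, n_in_set=200,
                           n_selected=300, n_overlap=12)
print(f"enrichment P = {p:.2e}")
```

In practice many gene sets are tested at once, so the resulting p-values would need correction for multiple testing, and genes clustered within a single locus are not independent draws.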
The 5p15 locus contains a single gene, DAP (death-associated protein), with the most associated SNP in this region having a strong eQTL effect on DAP expression (P=2.59×10⁻¹²)¹¹. DAP kinase expression has been shown to increase with inflammation in UC¹², and DAP itself has recently been identified as a novel substrate of mTOR (mammalian target of rapamycin)¹³ and as a negative regulator of autophagy. While autophagic processes have previously been implicated in CD due to associations with ATG16L1 and IRGM¹⁴, this association with DAP suggests a possible link between autophagy and UC.
Association to loci containing PRDM1, IRF5 and NKX2-3 suggests an important role for transcriptional regulation in UC pathogenesis. A key example is BLIMP-1, encoded by the PRDM1 gene, whose most important function is in B cells, as the master transcriptional regulator of plasma cells¹⁵. It also functions in T cells to attenuate IL-2 production upon antigen stimulation¹⁶, and to promote the development of short-lived effector cells and regulate clonal exhaustion in both CD4 and CD8 cells¹⁷. It is noteworthy that the 11q24 celiac disease susceptibility locus containing ETS1, a transcription factor essential for T-bet induced production of IFNγ and the development of colitis in animal models, just fails to reach genome-wide significance in our study (P=1.22×10⁻⁷, Supplementary Table 3b)¹⁸,¹⁹.
Identification of GNA12 as the most likely candidate at the 7p22 locus suggests a role for intestinal barrier function, as this gene is implicated in tight junction assembly in epithelial cells²⁰. Barrier integrity appears to be a key pathway in UC pathogenesis given previous associations to loci containing HNF4A, CDH1 and LAMB1²,⁵.
Given the phenotypic overlap between UC and CD, we examined the evidence for association at all 47 UC loci in our recently completed CD GWAS meta-analysis comprising 6,333 cases and 15,056 controls¹⁴ and, conversely, for evidence of association at all confirmed CD loci in our UC meta-analysis (Table 3 and Supplementary Table 9). We find that, among the 99 confirmed IBD loci meeting genome-wide significance (P<5×10⁻⁸) in UC and/or CD, 28 independent index SNPs have P<1×10⁻⁴ in both scans. Interestingly, all index SNPs meeting these criteria showed the same direction of effect in both diseases, thus pointing to a minimum of 28 shared association signals between UC and CD. Multiple genes involved in the IL23 signaling pathway are included in this overlapping SNP list, specifically IL23R, JAK2, STAT3, IL12B (p40), and PTPN2. The significance of these findings is underlined by the central role played by IL23 in the induction of IL17 by Th17 lymphocytes, its established role in other autoimmune disorders, and the intense interest in therapeutic manipulation of the IL23-IL23R interaction through blockade of the p40 or p19 IL23 subunits.
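The overlap criteria above can be written as a small filter. This is a sketch under stated assumptions: the `IndexSnp` record, its field names and the example values are hypothetical, and odds ratios are assumed to be reported for the same allele in both scans. Direction of effect is kept as a separate check because the text reports it as an observation rather than an inclusion criterion.

```python
from dataclasses import dataclass

GENOME_WIDE_P = 5e-8   # genome-wide significance threshold from the text
OVERLAP_P = 1e-4       # threshold required in both scans


@dataclass
class IndexSnp:
    name: str
    p_uc: float    # P-value in the UC scan
    or_uc: float   # odds ratio in the UC scan
    p_cd: float    # P-value in the CD scan
    or_cd: float   # odds ratio in the CD scan (same allele as or_uc)


def meets_overlap_criteria(snp: IndexSnp) -> bool:
    """Genome-wide significant in UC and/or CD, and P < 1e-4 in both scans."""
    genome_wide_in_either = min(snp.p_uc, snp.p_cd) < GENOME_WIDE_P
    suggestive_in_both = snp.p_uc < OVERLAP_P and snp.p_cd < OVERLAP_P
    return genome_wide_in_either and suggestive_in_both


def same_direction(snp: IndexSnp) -> bool:
    """True when the allele raises (or lowers) risk in both diseases."""
    return (snp.or_uc > 1.0) == (snp.or_cd > 1.0)


# Hypothetical SNPs, not results from the study.
snps = [
    IndexSnp("snp_1", p_uc=2e-12, or_uc=1.20, p_cd=3e-6, or_cd=1.15),
    IndexSnp("snp_2", p_uc=1e-9, or_uc=1.10, p_cd=0.01, or_cd=1.02),
]
shared = [s.name for s in snps if meets_overlap_criteria(s)]
print(shared, all(same_direction(s) for s in snps if s.name in shared))
```

Because the second threshold is arbitrary, SNPs that fail it (such as the second example) are not shown to be disease-specific; they are simply not counted as shared under this rule.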
Loci not meeting these inclusion criteria cannot be formally discounted as shared loci; indeed, many of the confirmed UC/CD loci with nominal association (1×10⁻⁴<P<0.05) to the other disease may be shared. Among the confirmed UC loci with no evidence (P>0.05) of association to CD are the three containing candidate genes that play a role in intestinal barrier function (GNA12, HNF4A, and LAMB1).
In addition to loci shared with CD, 19 of the 47 UC risk loci are also associated with other immune-mediated diseases (Table 1 and Table 2). In particular, these “shared loci” are enriched for genes involved in T-cell differentiation, specifically in the differentiation of TH1 and TH17 cells (e.g. loci encoding IL23R, IL21, IL10, IL7R, IFNG). Dysregulated auto-antigen specific TH1 responses are believed to be involved in organ-specific autoimmune diseases, and TH17 cells are increasingly recognized to contribute to host defense and induction of autoimmunity and tissue inflammation²¹. Another shared pathway between UC and other immune-mediated diseases involves TNF-signaling (TNFRSF9, TNFRSF14, TNFSF15) with widespread immunological effects including NF-κB activation, a known key component of the inflammatory response in IBD.
The current study has more than doubled the number of confirmed UC susceptibility loci and we estimate that 16% of UC heritability is explained by these loci (see Online Methods). We have identified potentially causal genes at several loci but confirmation of causality awaits detailed fine-mapping, expression and functional studies. Dense fine-mapping and large-scale re-sequencing studies are underway with the goal of identifying the causal variation within many of these loci.
In memoriam to Marc Lémann, who dedicated his life to his patients but died too soon.
We thank all subjects who contributed samples, and physicians and nursing staff who helped with recruitment globally. This study was supported by the German Ministry of Education and Research through the National Genome Research Network, the popgen biobank and infrastructure support through the DFG cluster of excellence ‘Inflammation at Interfaces’. Italian case collections were supported by the Italian Group for IBD and the Italian Society for Paediatric Gastroenterology, Hepatology and Nutrition. We acknowledge funding provided by Royal Brisbane and Women’s Hospital Foundation; University of Queensland (Ferguson Fellowship); National Health and Medical Research Council, Australia and by the European Community (5th PCRDT). UK case collections were supported by the National Association for Colitis and Crohn’s disease, Wellcome Trust, Medical Research Council UK and Peninsular College of Medicine and Dentistry, Exeter. Activities in Sweden were supported by the Swedish Society of Medicine, the Bengt Ihre Foundation, the Karolinska Institutet, the Swedish National Program for IBD Genetics, the Swedish Organization for IBD, the Swedish Medical Research Council, the Soderbergh Foundation and the Swedish Cancer Foundation. Support for genotyping and genetic data analysis was provided by the Agency for Science Technology and Research (A*STAR), Singapore. We are grateful to the funders and investigators of the Epidemiological Investigation of Rheumatoid Arthritis for providing genotype data from healthy Swedish individuals.
The Wellcome Trust Case Control Consortium 2 project was supported by Wellcome Trust grant 083948/Z/07/Z. We also acknowledge the NIHR Biomedical Research Centre awards to Guy’s & St. Thomas’ NHS Trust/King’s College London and to Addenbrooke’s Hospital/University of Cambridge School of Clinical Medicine/University of Manchester and Central Manchester Foundation Trust. The NIDDK IBD Genetics Consortium is funded by the following grants: DK062431 (SRB), DK062422 (JHC), DK062420 (RHD), DK062432 (JDR), DK062423 (MSS), DK062413 (DPBM), DK076984 (MJD), DK084554 (MJD and DPBM) and DK062429 (JHC). JHC is also funded by the Crohn’s and Colitis Foundation of America; SLG by DK069513 and Primary Children’s Medical Center Foundation, and JDR by NIH/NIDDK grant DK064869. Cedars-Sinai is supported by NCRR grant M01-RR00425; NIH/NIDDK grant P01-DK046763; DK 063491; and Cedars-Sinai Medical Center Inflammatory Bowel Disease Research Funds. RW is supported by a clinical fellow grant (90700281) from the Netherlands Organization for Scientific Research; EL, DF and SV are senior clinical investigators for the Funds for Scientific Research (FWO/FNRS) Belgium. SB was supported by Deutsche Forschungsgemeinschaft (DFG BR 1912/5-1) and Else Kröner-Fresenius-Stiftung (P50/05/EKMS05/62). MC was supported by the Programme Hospitalier de Recherche Clinique. CAA is supported by Wellcome Trust grant WT091745/Z/10/Z. JCB is supported by Wellcome Trust grant WT089120/Z/09/Z. RKW is supported by a clinical fellowship grant (90.700.281) from the Netherlands Organization for Scientific Research (NWO). CW is supported by grants from the Celiac Disease Consortium (BSIK03009) and the Netherlands Organization for Scientific Research (NWO, VICI grant 918.66.620). LHvdB acknowledges funding from the Prinses Beatrix Fonds, the Adessium foundation and the Amyotrophic Lateral Sclerosis Association.
LF received a Horizon Breakthrough grant from the Netherlands Genomics Initiative (93519031) and a VENI grant from NWO (ZonMW grant 916.10.135). RJX and AN are funded by DK83756, AI062773, DK043351 and the Helmsley Foundation.
Replication genotyping was supported by unrestricted grants from Abbott Laboratories Ltd, Giuliani SpA, Shire PLC and Ferring Pharmaceuticals. We thank the 1958 British Birth Cohort and Banco Nacional de ADN, Salamanca, Spain, who supplied control DNA samples. The IBSEN study group and the Norwegian Bone Marrow Donor Registry are acknowledged for contributing the Norwegian patient and control populations. The CHS research reported in this article was supported by contract numbers N01-HC-85079 through N01-HC-85086, N01-HC-35129, N01 HC-15103, N01 HC-55222, N01-HC-75150, N01-HC-45133, grant numbers U01 HL080295 and R01 HL087652 from the National Heart, Lung, and Blood Institute, with additional contribution from the National Institute of Neurological Disorders and Stroke. A full list of principal CHS investigators and institutions can be found at http://www.chs-nhlbi.org/pi.htm. We thank the members of the Quebec IBD Genetic Consortium, in particular A. Bitton, G. Aumais, E.J. Bernard, A. Cohen, C. Deslandres, R. Lahaie, D. Langelier and P. Paré. Other significant contributors: K. Hanigan, N. Huang, P. Webb, D. Whiteman, A. Rutherford, R. Gwilliam, J. Ghori, D. Strachan, W. McCardle, W. Ouwehand, M. Newsky, S. Ehlers, I. Pauselius, K. Holm, C. Sina, M. Regueiro, A. Andriulli and M.C. Renda.
Contribution of authors: CWL, AF, KDT, JCL, MI, AL, LA, LB, RNB, MB, TMB, SB, CB, J-FC, LAD, MdV, MD, CE, RSNF, TF, DF, MG, JG, NLG, SLG, TH, NKH, J-PH, GJ, DL, IL, ML, AL, CLi, EL, DPM, MM, CM, AN, WN, RAO, LP, OP, LPB, JP, AP, NJP, DDP, RRo, RRu, PR, JS, MS, PS, FS, YS, MS, AHS, SRT, LHvdB, MV, HV, TW, CW, DCW, H-JW, CYP, VA, LT, MG, NPA, THK, LK, JS, JCM, SK, MSS, JH, JIR, CGM, AMG, RG, TA, SRB, MC, JS, JHC, SS, MP, VA, HH, GRS, RHD, SV, RKW and JDR established DNA collections, recruited patients or assembled phenotypic data; AF, MD’A, PG, CLa, RS, SB, CLi, DPM, GWM, LS, ZZZ, MC, RHD, and JDR conducted or supervised laboratory work; CAA, GB, DE, JABF, LF, KIM, AN, RAO, RJX, MJD, JCB, RKW, and JDR performed or supervised statistical analyses; CAA, GB, CWL, GRS, RHD, SV, RKW and JDR drafted the manuscript. All authors read and approved the final manuscript before submission.
All authors declare no financial interest.