We report the results of a meta-analysis of genome-wide association scans for multiple sclerosis (MS) susceptibility that includes 2,624 subjects with MS and 7,220 control subjects. Replication in an independent set of 2,215 subjects with MS and 2,116 control subjects validates new MS susceptibility loci at TNFRSF1A (combined P = 1.59 × 10−11), IRF8 (P = 3.73 × 10−9) and CD6 (P = 3.79 × 10−9). TNFRSF1A harbors two independent susceptibility alleles: rs1800693 is a common variant with modest effect (odds ratio = 1.2), whereas rs4149584 is a nonsynonymous coding polymorphism of low frequency but with stronger effect (allele frequency = 0.02; odds ratio = 1.6). We also report that the susceptibility allele near IRF8, which encodes a transcription factor known to function in type I interferon signaling, is associated with higher mRNA expression of interferon-response pathway genes in subjects with MS.
Multiple sclerosis is thought to emerge when genetically susceptible individuals encounter environmental triggers and initiate an inflammatory reaction against self-antigens in the central nervous system (CNS); these events result in recurring episodes of inflammatory demyelination and, in many cases, a progressive neurodegenerative process1. The genetic architecture underlying susceptibility to MS is complex, and there are no known mendelian forms. As seen with many other inflammatory diseases, the major histocompatibility complex (MHC) has long been associated with MS, and both class I and class II susceptibility alleles exist2,3. However, a recent genome-wide association study (GWAS) revealed the existence of multiple non-MHC MS susceptibility loci of modest effect4. The role of three such loci—CLEC16A, IL2RA and IL7R—has now been well validated by other investigators and by our own replication efforts5-7. Given the success of the GWAS approach in MS, we extended earlier gene discovery efforts by pooling together data from three separate genome-wide studies exploring the genetic architecture of MS and report three newly identified susceptibility loci for MS.
We conducted a meta-analysis of genome-wide data from (i) 895 subjects with MS genotyped in the original scan by the International MS Genetics Consortium4, (ii) 969 subjects with MS scanned by the GeneMSA consortium8 and (iii) an unpublished set of data generated from 860 subjects with MS recruited at the Partners MS Center in Boston, Massachusetts. A detailed description of the component sample sets is presented in Table 1, and their clinical characteristics are outlined in Supplementary Table 1a online. As each of the studies used a different genotyping platform (Table 1), we used the phased chromosomes of HapMap samples of European ancestry (CEU)9 and the MACH algorithm (see URLs section in Online Methods)10 to impute missing autosomal SNPs with a minor allele frequency >0.01 in each of the three datasets. This effort produced a dataset containing a common panel of 2.56 million SNPs in 2,624 subjects with MS and 7,220 healthy control subjects. We then implemented a meta-analysis method that combines the association results from each of the six strata of subjects outlined in Table 1, taking into account the imputation uncertainty for each SNP (see Online Methods)11. Overall, the degree of statistical inflation was modest (genomic inflation factor λ = 1.054).
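As an illustration of the two quantities mentioned above, the following is a minimal sketch, not the method used in this study (which additionally accounts for imputation uncertainty and is described in the Online Methods). It assumes that a log odds ratio and standard error are available for each stratum and combines them with a standard fixed-effect, inverse-variance weighting; it also computes a genomic inflation factor as the ratio of the median observed 1-d.f. chi-square statistic to its expected median. The function names and inputs are hypothetical.

```python
import math
from statistics import NormalDist, median


def inverse_variance_meta(betas, ses):
    """Fixed-effect meta-analysis of per-stratum log odds ratios.

    betas: per-stratum effect estimates (log odds ratios)
    ses:   per-stratum standard errors
    Returns (combined beta, combined standard error, two-sided P value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided normal tail probability; erfc stays accurate for small P.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p_value


def genomic_inflation(p_values):
    """Genomic inflation factor: median observed chi-square (1 d.f.)
    divided by the median of the null chi-square (1 d.f.) distribution."""
    nd = NormalDist()
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    null_median = nd.inv_cdf(0.75) ** 2  # approximately 0.455
    return median(chi2) / null_median
```

A value of λ close to 1 indicates that the bulk of the test statistics follows the null distribution; values well above 1 suggest residual stratification or other systematic inflation.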
To organize the top results of the meta-analysis, we assembled SNPs into groups that were highly correlated with one another (r2 > 0.5), selecting the SNP with the most extreme evidence of association (lowest P value) to serve as the representative tagging marker for that group. Supplementary Table 2 online lists the 100 independent loci with the lowest P values for association with susceptibility to MS, ranging from the CD58 locus (rs12025416, P = 4.74 × 10−8) to the 14q31.3 locus (rs2022771, P = 4.89 × 10−5). Each locus is defined by a single tagging SNP and contains those polymorphisms that are in linkage disequilibrium (LD) with it. These 100 loci form the core of the SNP panel that was genotyped in the replication sample set (Table 1 and Supplementary Table 2). Given the preponderance of women over men among individuals affected with MS, we also conducted a secondary genome-wide regression analysis that included a term for gender and a term for subject source to account for the structure of our subject samples (see Online Methods). Of the top 50 loci in this gender-adjusted analysis, 41 were not redundant with known loci or loci selected by the primary analysis; these 41 SNPs were therefore also included in the replication panel.
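The grouping step described above can be sketched as a greedy procedure: take the SNP with the lowest P value as the tagging marker, assign to it every remaining SNP with r2 above the threshold, and repeat. The exact algorithm used in the study is not specified here, so this is an assumption, as are the function name, the input layout and the toy SNP identifiers.

```python
def group_by_ld(p_values, r2, threshold=0.5):
    """Greedy LD-based grouping of SNPs.

    p_values:  dict mapping SNP id -> association P value
    r2:        dict mapping frozenset({snp_a, snp_b}) -> pairwise r2
               (pairs that are absent are treated as r2 = 0)
    threshold: SNPs with r2 > threshold join the tagging SNP's group
    Returns a dict mapping each tagging SNP -> list of grouped SNPs.
    """
    ordered = sorted(p_values, key=p_values.get)  # lowest P value first
    assigned = set()
    groups = {}
    for snp in ordered:
        if snp in assigned:
            continue
        members = [
            other
            for other in ordered
            if other != snp
            and other not in assigned
            and r2.get(frozenset((snp, other)), 0.0) > threshold
        ]
        groups[snp] = members
        assigned.add(snp)
        assigned.update(members)
    return groups


# Toy input with made-up identifiers and values (not study data).
toy_p = {"snpA": 1e-8, "snpB": 1e-6, "snpC": 1e-5, "snpD": 1e-4}
toy_r2 = {
    frozenset(("snpA", "snpB")): 0.8,
    frozenset(("snpC", "snpD")): 0.6,
    frozenset(("snpB", "snpC")): 0.3,
}
toy_groups = group_by_ld(toy_p, toy_r2)
```

In this toy input, snpA and snpC become tagging markers for two groups; snpB and snpC are not merged because their r2 is below the threshold.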
To supplement the 141 susceptibility loci selected in an unbiased manner, we selected an additional 47 SNPs for replication using one of the following strategies (Supplementary Table 2). First, we screened all SNPs with a P < 10−3 in the meta-analysis and selected 32 loci that included candidate genes implicated in MS or pathologic inflammation according to a search of current literature. Second, we selected eight nonsynonymous coding SNPs (nscSNPs) with P < 10−3 in the meta-analysis that also had a P < 0.01 in an independent screen for nscSNPs in MS12. Finally, for reference, we included seven SNPs previously associated with MS at or near a genome-wide level of significance. The putative association of the rs10492972 marker in the KIF1B locus with MS susceptibility13 was not known at the time the replication panel was designed. As this locus did not offer evidence of association in our meta-analysis (P = 0.72), it was not included in the replication study. In all, we genotyped 188 SNPs in the replication samples from the UK and the US (Table 1 and Supplementary Table 1b), of which 180 SNPs provided high-quality data for subsequent Cochran-Mantel-Haenszel analysis as well as a joint analysis of the replication and meta-analysis results (Supplementary Table 3 online). The relative success of each of our SNP selection strategies is reported in Supplementary Table 3. We note that we do not have genome-wide estimates of ancestry for the subjects in the replication study, and therefore we cannot assess the level of population stratification that may exist within the separate UK and US strata of the replication samples.
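For readers unfamiliar with the Cochran-Mantel-Haenszel analysis mentioned above, the following minimal sketch shows how a common odds ratio and a 1-d.f. test statistic can be computed across strata (for example, separate UK and US strata). It uses the textbook formulas without a continuity correction; the function name and the table layout are assumptions, and any counts passed to it in examples are illustrative rather than study data.

```python
import math


def cmh_test(strata):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables.

    Each stratum is a tuple (a, b, c, d) of allele counts:
      a = risk allele in cases,    b = other allele in cases,
      c = risk allele in controls, d = other allele in controls.
    Returns (Mantel-Haenszel odds ratio, chi-square statistic, P value).
    """
    or_num = or_den = 0.0
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        or_num += a * d / n
        or_den += b * c / n
        observed += a
        # Expectation and variance of a under the null, given the margins.
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    odds_ratio = or_num / or_den
    chi2 = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value
```

Because each stratum contributes its own expected count and variance, differences in allele frequency between strata do not by themselves generate an association signal.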
Among the 180 SNPs that met quality-control criteria, we observed an excess of associations in the replication stage that was consistent with the direction of effect observed in the meta-analysis (Fig. 1). In Table 2, we present the top results of the replication analysis and the combined evidence for association of these loci. The known MHC class I and class II associations were detected in the replication samples: rs3135388 is the surrogate marker for the HLA DRB1*1501 risk allele and rs2523393 is a surrogate marker for the HLA B*4402 allele (Table 2 and Online Methods)14. Consistent with previous findings, the associations of HLA B*4402 and HLA DRB1*1501 were independent (Supplementary Table 4 online).
Outside the MHC, the previously validated associations with the CLEC16A, IL2RA and IL7R loci were observed (Table 2), and we now validate the CD58 locus at a level of genome-wide significance (P = 3.10 × 10−10) (Table 2). The best CD58 marker, rs2300747, was identified in earlier fine-mapping exercises in this locus15 and is in strong linkage disequilibrium (LD) (r2 = 0.73 in HapMap CEU samples) with the CD58 marker selected from the meta-analysis, rs12025416 (P = 1.16 × 10−9) (Supplementary Table 3). Logistic regression revealed no evidence of an independent effect at rs12025416 in the replication samples (Supplementary Table 4). Thus, we see no evidence for allelic heterogeneity at the CD58 locus in our data, and our previously identified CD58 SNP (rs2300747) remains the best marker of a susceptibility allele within the CD58 locus.
In the replication data, we found strong evidence for the presence of three previously unreported associations with genome-wide significance (P < 5 × 10−8) in the joint analysis: they are located in the TNFRSF1A (rs1800693, P = 1.59 × 10−11), IRF8 (rs17445836, P = 3.73 × 10−9) and CD6 (rs17824933, P = 3.79 × 10−9) loci. All three loci were selected for replication in an unbiased manner on the basis of the results of the meta-analysis (Supplementary Table 2). The rs1800693[T] allele associated with increased risk of MS (odds ratio (OR) = 1.20, 95% confidence interval (CI) = 1.10–1.31, in the replication stage) is found within the sixth intron of TNFRSF1A, and thus we refer to this region as the TNFRSF1A locus. Another gene, PLEKHG6, is found within the block of LD that contains rs1800693, so, formally, either or both genes could be associated with MS susceptibility (Fig. 2a). However, none of the SNPs that are in LD with rs1800693 are located within the PLEKHG6 gene region (Fig. 2a); future fine-mapping efforts and follow-up studies will be needed to definitively resolve the functional basis of this association. Nonetheless, current literature supports a role for TNFRSF1A, as it has previously been implicated in tumor necrosis factor–associated periodic syndrome (TRAPS). This rare syndrome consists of recurrent episodes of systemic inflammation with variable symptoms including fever, abdominal pain, myalgia, arthralgia, exanthema and ocular involvement16, and many affected individuals have been shown to have one of over 57 coding or noncoding mutations in TNFRSF1A (see URLs section in Online Methods). Most of these alleles are rare variants found in certain pedigrees, but a few less-penetrant alleles are segregating in European populations at < 0.05 frequency17,18. 
Notably, a number of subjects with demyelinating or demyelinating-like diseases have been recently reported to harbor such variants (such as the R92Q substitution), but the slight excess in the proportion of these polymorphisms in MS subjects was not significant16,19,20. In our meta-analysis, only the R92Q polymorphism (rs4149584, labeled R121Q in dbSNP) has been analyzed. It has substantial evidence of association with MS susceptibility in the meta-analysis (P = 0.0003) as well as the replication effort (P = 0.0042), and it is not in LD with rs1800693 (r2 = 0.041 in HapMap CEU samples), the common TNFRSF1A variant identified in our study (Table 2). Conditional analysis suggests that the two SNPs represent independent associations (Supplementary Table 4). Although a detailed investigation of common and rare variants in this locus is necessary to fully characterize MS-related effects, our study provides the first definitive link between the TNFRSF1A locus and susceptibility to demyelinating disease at both a high-frequency polymorphism of modest effect (rs1800693) and a low-frequency polymorphism of stronger effect (rs4149584). Altogether, rs1800693[T] and rs4149584[T] are excellent candidate risk alleles for other inflammatory diseases, particularly those with rheumatologic features.
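The odds ratios and confidence intervals reported for these alleles can be related to one another on the log scale. The sketch below assumes a symmetric Wald-type 95% interval on the log odds ratio, which may not match exactly how the intervals in this study were derived; the function names are hypothetical. It recovers an approximate standard error and z score from a reported odds ratio and interval, such as the replication-stage estimate for rs1800693 (OR = 1.20, 95% CI = 1.10–1.31).

```python
import math
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Approximate standard error of the log odds ratio from a
    confidence interval, assuming a symmetric Wald-type interval."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)


def z_from_or(odds_ratio, lower, upper, level=0.95):
    """Approximate z score implied by an odds ratio and its interval."""
    return math.log(odds_ratio) / se_from_ci(lower, upper, level)


# Replication-stage values reported in the text for rs1800693.
se_rs1800693 = se_from_ci(1.10, 1.31)
z_rs1800693 = z_from_or(1.20, 1.10, 1.31)
```

Because reported odds ratios and interval bounds are rounded, values recovered this way are approximate.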
The association of rs17445836[A] (OR = 0.80, 95% CI = 0.72–0.89) is also new and is found in a region of elevated recombination rate and lower LD (Fig. 2b). On the centromeric side, this SNP is located within 61 kb of IRF8 (interferon regulatory factor 8; also known as interferon consensus sequence binding protein 1, ICSBP1). Because the closest telomeric gene, FOXF1, is 526 kb away, we refer to this association as being in the IRF8 locus. As its name implies, IRF8 is one of several transcription factors that regulate responses to type I interferons (α and β interferons) by binding the interferon-stimulated response element (ISRE) (MIM 601565). It has many roles that include involvement in B-cell germinal center development as well as macrophage cell function21,22.
The third locus contains the rs17824933[G] susceptibility allele (OR = 1.18, 95% CI = 1.07–1.30) and is bounded by two peaks of recombination (Fig. 2c) between which only one gene, CD6, is found. The excess of extreme results with modest LD (0.8 > r2 > 0.5) to rs17824933 at the telomeric end of the block of LD suggests that there may be an independent association within this locus. CD6, like CD58, is a molecule involved in T-cell costimulation and differentiation23,24; it may therefore have a role in modulating the activation and proliferation of T cells in the context of an inflammatory disease. In fact, these properties led to its targeting with a blocking monoclonal antibody in a clinical trial treating individuals with MS25. Finally, the soluble form of CD6 may also function as a pattern recognition receptor and affects the serum level of TNFα in this context in mice26. Thus, the rs17824933[G] susceptibility allele found in the first intron of CD6 may have functional repercussions that interact with those of the TNFRSF1A locus.
Four other loci with P < 10−4 in our joint analysis (Table 2) have previously validated associations with other inflammatory diseases and are therefore likely to be true MS susceptibility loci: IL12A, OLIG3-TNFAIP3, PTGER4 and RGS1. The putative IL12A and RGS1 MS susceptibility alleles are in strong LD with known celiac disease susceptibility alleles27, and the PTGER4 MS allele is in strong LD with the known Crohn’s disease allele28 in this locus (Supplementary Table 5a online). On the other hand, the signal of association within the OLIG3-TNFAIP3 locus seems to be distinct from known associations with psoriasis, rheumatoid arthritis (RA) and systemic lupus erythematosus (SLE) (Supplementary Table 5a)29-32. Given the discovery of these strong candidate MS loci, we extended this comparative analysis to a larger number of loci by comparing our list of 100 top MS loci selected for replication (Supplementary Table 2) to the list of 76 Crohn’s disease loci in which replication was attempted28. We found seven loci with substantial evidence of association in both diseases: these include not only loci with a validated role in one disease (IL12B and PTGER4 in Crohn’s disease as well as IRF8 in MS) but also loci with suggested roles in both diseases (BCL2, NEDD4L, PPA2 and STAT3; Supplementary Table 5b).
Given the newly discovered association to the IRF8 locus that contains an important transcription factor involved in responses to type I interferons, we explored its possible functional consequences by investigating a set of RNA data (Affymetrix U133 2.0 array) captured from the peripheral blood mononuclear cells (PBMCs) of 240 subjects of European ancestry with either relapsing-remitting MS (RRMS, n = 230) or a clinically isolated demyelinating syndrome (CIS, n = 10), many of whom go on to develop MS (Supplementary Table 1c). These subjects can be classified into three categories: untreated subjects (n = 82), interferon β (IFNβ)-treated subjects (n = 94) and glatiramer acetate (GA)-treated subjects (n = 64). We used an unbiased approach to assess these data for the hypothesis that the IRF8 locus may have a modest but broad effect on RNA expression from genes involved in interferon response. Specifically, we applied a gene set enrichment analysis (GSEA) methodology33 to explore the results of a quantitative trait (eQTL) analysis correlating rs17445836 with our genome-wide RNA expression data from subjects with MS and CIS. As IRF8 is known to be an interferon-response gene, its function could be affected either by IFNβ treatment or by GA treatment, which is reported to suppress IFNβ expression (S. Zamvil, University of California, San Francisco, personal communication). Thus, we pursued this GSEA screen of RNA data separately in each of our three subject subsets (untreated, IFNβ-treated and GA-treated). Sixteen gene sets that meet our threshold of significance (an FDR q value < 0.05) have genes that are coordinately upregulated in the presence of the rs17445836[G] allele in both the untreated and the IFNβ-treated subject subsets. Specifically, each of the 16 shared gene sets contains genes whose expression is coordinately enhanced under an additive model for rs17445836[G] association.
In Table 3, we present the most associated of these 16 gene sets, that is, those gene sets that have an FDR q value < 0.001 in both sets of subjects. All eight of these most associated gene sets are primarily defined as being interferon-responding or are known to contain responses to type I interferons. Detailed results of each analysis are presented in Supplementary Table 6a,b online. Upregulation of interferon pathway genes in peripheral blood has previously been noted in ~50% of untreated subjects with MS34,35, so the overlap between the untreated and IFNβ-treated subsets suggests that these results are consistent with our current knowledge of pathophysiology in MS. The lack of replication of the results of the untreated group in the GA-treated group is intriguing; it could be due to the smaller size of this subject subset (n = 64) and/or the suppression of IFNβ expression by GA. Further validation experiments in these subject subsets are needed to confirm our observations and explore the interactions of these MS treatments with the effect of the rs17445836[G] allele.
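The additive model underlying the quantitative trait analysis above can be illustrated with a simple per-gene regression of expression on the number of copies of the rs17445836[G] allele (0, 1 or 2). This is a minimal sketch under that assumption only; it does not reproduce the study's analysis pipeline or the GSEA procedure, and the function name and inputs are hypothetical.

```python
import math


def additive_eqtl(dosages, expression):
    """Least-squares fit of expression on allele dosage (0, 1 or 2 copies).

    Returns (slope, intercept, t statistic). The slope is the estimated
    change in expression per additional copy of the allele.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum(
        (y - intercept - slope * x) ** 2 for x, y in zip(dosages, expression)
    )
    se = math.sqrt(residual_ss / (n - 2) / sxx)
    # A perfect fit has zero residual variance; report an infinite t then.
    t_stat = slope / se if se > 0 else math.copysign(math.inf, slope)
    return slope, intercept, t_stat
```

A per-gene statistic of this kind is one way to rank genes before asking whether the members of a predefined gene set cluster toward one end of the ranking, which is the question a gene set enrichment analysis addresses.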
To control for potential bias in our analysis method, we repeated this investigation of the quantitative trait analysis results using the Ingenuity Pathways Analysis software suite (see URLs section in Online Methods). Here, using the same set of quantitative trait analysis results, we find significant co-regulation, relative to the rs17445836[G] allele, of genes within Ingenuity Systems’ predefined “canonical interferon signaling pathway” among both untreated subjects (P = 0.001) and IFNβ-treated subjects (P = 0.01) (Fig. 3). The GA-treated subjects do not have a significant co-regulation of genes in this pathway. We have also repeated the GSEA and Ingenuity analyses using the best markers for MS susceptibility in the CD6, CD58 and TNFRSF1A loci that were validated in this meta-analysis; none of these three loci show significant co-regulation within the interferon pathway (data not shown) relative to the best susceptibility marker in each locus. In addition, we examined a publicly available dataset generated from a different cell type (EBV-transformed B cells) for the effect of the rs17445836[G] susceptibility allele on interferon response but did not observe this association in the small sample of HapMap cell lines of European ancestry (data not shown)36. Our data therefore suggest that both at baseline and during chronic exposure to exogenous IFNβ the rs17445836[G] susceptibility allele may have a widespread but specific effect on gene expression in PBMCs from subjects with MS, particularly within the interferon response pathway in which IRF8 is known to function (Fig. 3).
Only one probe in our RNA dataset provided information on the IRF8 gene itself, and this probe shows no evidence of correlation between rs17445836[G] and IRF8 expression. Thus, the mechanism by which rs17445836[G] influences gene expression remains unknown at this time, and more comprehensive studies of the expression of IRF8 and its RNA isoforms in specific cell populations are needed to address this question.
Our current data suggest that dysregulation of interferon responses may be one of the early events that contribute to the onset of MS. Upregulation of interferon responses has been noted not only in a subset of MS subjects34,35, but also in subjects with other inflammatory diseases (dermatomyositis37, rheumatoid arthritis and SLE38,39), and may reflect a shared feature of autoimmunity. However, the role of interferons in the onset of MS remains to be better defined. In addition, other pathways may also be affected by the IRF8 variant, such as a gene set defined in response to TNFα stimulation that is coordinately upregulated in untreated MS subjects with the rs17445836[G] allele of IRF8 (FDR q value < 10−4, Supplementary Table 6a). This observation suggests a link between the functional consequences of the IRF8 locus and those of the newly identified TNFRSF1A locus.
The possible role of the TNFRSF1A alleles in MS is informed by human clinical data implicating the TNFα pathway: treatment with monoclonal antibodies to TNFα may trigger acute episodes of CNS inflammation in subjects with MS40. A phase II clinical trial with a TNFRSF1A:IgG1 fusion protein (lenercept) also reported increased clinical attacks, although these occurred in the absence of enhanced disease activity on magnetic-resonance imaging and disability41. Furthermore, demyelinating lesions are a possible adverse event in subjects with Crohn’s disease or rheumatoid arthritis treated with monoclonal antibodies to TNFα42. Thus, genetic and functional data now merge and suggest that dysregulation of the TNFα pathway has a role in the onset of MS, with diminished TNFα activity being associated with onset of CNS inflammatory lesions in clinical data. The suggested association of the OLIG3-TNFAIP3 locus fits within this theme, as may the association of CD6: soluble CD6 may function as a pattern recognition receptor and influence circulating levels of TNFα26. With its observations relating to responses to type I interferons and TNFα, this study focuses our attention on dysregulation within the innate immune system in MS susceptibility. Dysfunction of the innate immune system, an important first line of defense against pathogens, has long been noted in immunopathology studies of MS. This history has contributed to the longstanding hypothesis of a viral or microbial trigger for MS, with the best evidence residing with the Epstein-Barr virus (EBV), and we must now consider our new associations in the context of such environmental risk factors43.
Given the nearly 3:1 preponderance of women in all of our cohorts, we also need to better understand the impact of gender on MS susceptibility. Our secondary analysis that includes a term for gender was conducted for this reason and was successful in highlighting the role of the CXCR4 locus (P = 1.37 × 10−7) (Supplementary Table 3). Although this locus was selected for replication on the basis of the secondary gender analysis, it showed strong evidence of replication in our primary replication analysis without a term for gender. The results of the secondary analysis of the replication data that includes gender as a covariate are shown in Supplementary Table 7 online. The top results of this analysis generally mirror those of the primary replication analysis.
Overall, most validated and strongly suggested MS susceptibility loci (CD6, CD58, CLEC16A, HLA-B, HLA-DRB1, IRF8, IL2RA, IL7R, IL12A, OLIG3-TNFAIP3, PTGER4, RGS1 and TNFRSF1A) have well-known and primarily immunologic functions. This is particularly true for our newly validated loci (CD6, IRF8 and TNFRSF1A) that were selected for replication in an unbiased manner. In addition, many of these MS susceptibility loci have validated roles in other inflammatory diseases. Thus, we inform the ongoing debate of the relative roles of neurodegeneration and inflammation in the onset of MS by reporting a preponderance of current genetic evidence in favor of early immune dysregulation that may trigger secondary neurodegenerative processes. A definitive evaluation of this question awaits a more complete map of genetic susceptibility factors and a more comprehensive understanding of the functions of the associated genes in different cell types. This search for further susceptibility loci is also guided by the important observation that a less common variant (frequency of ~2% in European populations) of stronger effect has now been associated with susceptibility to MS in the TNFRSF1A locus. This suggests that future investigations of complex traits in MS will have to target this class of low-frequency alleles, which are typically only poorly interrogated by the current set of genome-wide SNP genotyping platforms.
Methods and any associated references are available in the online version of the paper at http://www.nature.com/naturegenetics/.
P.L.D. is a Harry Weaver Neuroscience Scholar of the National MS Society (NMSS); he is also a William C. Fowler Scholar in Multiple Sclerosis Research and is supported by a National Institute of Neurological Disorders and Stroke (NINDS) K08 grant, NS46341. D.A.H. is a Jacob Javits Scholar of the US National Institutes of Health; he is also supported by NINDS P01 AI039671, R01 NS049477, R01NS046630, NMSS Collaborative MS Research Award and NMSS RG3567A. The International MS Genetics Consortium is supported by R01NS049477. L.P. is supported by an NMSS fellowship grant (FG1665-A-1). The genome-wide data on the BWH subjects and the RNA data on MS and CIS subjects from the CLIMB study were generated as part of a collaboration with Affymetrix, Inc. We thank the Myocardial Infarction Genetics Consortium (MIGen) study for the use of their genotype data as control data in our study. The MIGen study was funded by the US National Institutes of Health and National Heart, Lung, and Blood Institute’s STAMPEED genomics research program and a grant from the National Center for Research Resources. We acknowledge use of genotype data from the British 1958 Birth Cohort DNA collection, funded by the Medical Research Council grant G0000934 and the Wellcome Trust grant 068545/Z/02. We thank R. Lincoln and R. Gomez for expert specimen management at UCSF as well as A. Santaniello for database management. We thank the Accelerated Cure Project for its work in collecting samples from subjects with MS and for making these samples available to MS investigators. We also thank the following clinicians for contributing to sample collection efforts: Accelerated Cure Project, E. Frohman, B. Greenberg, P. Riskind, S. Sadiq, B. Thrower and T. Vollmer; Washington University, B.J. Parks and R.T. Naismith. Finally, we thank the Brigham & Women’s Hospital PhenoGenetic Project for providing DNA samples from healthy subjects that were used in the replication effort of this study.
AUTHOR CONTRIBUTIONS P.L.D., D.A.H., S.L.H., P.M.M. and J.R.O. designed the study. P.L.D. and J.R.O. wrote the manuscript. P.I.W.d.B., P.L.D., S.R., M.J.D., D.T., J.W., S.E.B. and X.J. performed analytical work. P.I.W.d.B., X.J. and M.J.D. developed the meta-analysis method while S.R. developed the subject matching algorithm. L.O. and P.L.D. performed the quality control analysis and quantitative trait analysis of the RNA from MS PBMC samples. C.A. generated and processed genotype data for analysis. P.L.D., N.T.A., L.P., R.B., R.A.G., P.M.M., Y.N., L.K., B.U., C.P., W.L.M., D.P.S., D.E., A.H.C., A.C., S.J.S., H.L.W., S.L.H., J.R.O. and D.A.H. contributed to DNA sample collection and genetic data. J.L.M., M.A.P.-V. and J.L.H. contributed to the interpretation of the results. All authors have read and contributed to the manuscript.
Note: Supplementary information is available on the Nature Genetics website.
Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions/