Genome-wide association studies have recently identified at least 15 susceptibility loci for systemic lupus erythematosus (SLE). To confirm additional risk loci, we selected SNPs from 2,466 regions that showed nominal evidence of association to SLE (P < 0.05) in a genome-wide study and genotyped them in an independent sample of 1,963 cases and 4,329 controls. This replication effort identified five new SLE susceptibility loci (P < 5 × 10−8): TNIP1 (odds ratio (OR) = 1.27), PRDM1 (OR = 1.20), JAZF1 (OR = 1.20), UHRF1BP1 (OR = 1.17) and IL10 (OR = 1.19). We identified 21 additional candidate loci with P ≤ 1 × 10−5. A candidate screen of alleles previously associated with other autoimmune diseases suggested five loci (P < 1 × 10−3) that may contribute to SLE: IFIH1, CFB, CLEC16A, IL12B and SH2B3. These results expand the number of confirmed and candidate SLE susceptibility loci and implicate several key immunologic pathways in SLE pathogenesis.
Systemic lupus erythematosus (SLE) is a chronic inflammatory autoimmune disease characterized by the presence of antibodies to nuclear self-antigens. Many of the lupus autoantibodies recognize nucleic acids and nucleic acid binding proteins, which in turn activate Toll-like receptors, leading to the production of type I interferon1. Despite considerable clinical heterogeneity, SLE ranks among the most heritable of common autoimmune diseases, with a sibling risk ratio of ~30 (ref. 2). Recent genome-wide association (GWA) and candidate gene studies have identified at least 15 common SLE risk alleles that achieve genome-wide significance (P < 5 × 10−8). These include genes encoding proteins important for adaptive immunity and the production of autoantibodies (HLA class II alleles, BLK, PTPN22 and BANK1) and proteins with roles in innate immunity and interferon signaling (ITGAM, TNFAIP3, STAT4 and IRF5)3–10. To identify additional risk loci, we performed a targeted replication study of SNPs from 2,466 loci that showed a nominal P value of <0.05 in a recent GWA7 scan of 1,310 individuals with lupus (cases) and 7,859 controls. We also genotyped SNPs from 23 previously reported SLE risk loci, 42 SNPs implicated in other autoimmune diseases and over 7,000 ancestry-informative markers (Fig. 1).
We designed a custom SNP array (Illumina Infinium II) consisting of over 12,000 variants and genotyped two independent SLE case and control populations from the United States (1,129 SLE cases and 2,991 controls) and Sweden (834 SLE cases and 1,338 controls). Included among the US controls were 2,215 Alzheimer’s disease case-control samples, which were judged to be acceptable as controls because the genetic factors underlying SLE and Alzheimer’s disease are expected to be independent. We next applied data quality filters to remove poorly performing samples and SNPs, population outliers, duplicate samples and related individuals (see Online Methods). Following these quality control measures, we examined a final set of 10,848 SNPs (Fig. 1). Association statistics for 3,735 SNPs were calculated and corrected for population stratification using 7,113 ancestry-informative markers (see Online Methods).
We first examined 25 SNPs (from 23 loci) that were previously reported to be associated with SLE (Table 1 and Supplementary Table 1). We found further evidence of association for 21 of the variants (P < 0.05), including 9 loci that reached genome-wide significance (P < 5 × 10−8) in the current combined dataset. Among the loci with genome-wide significant results were HLA-DRB1 (HLA*DR3 or DRB1*0301), IRF5, TNFAIP3, BLK, STAT4, ITGAM, PTPN22, PHRF1 (also called KIAA1542) and TNFSF4 (also called OX40L). The analysis also provided additional evidence of association for variants at nine loci for which a single previous study reported genome-wide levels of significance: HLA-DRB1 (HLA*DR2 or DRB1*1501), TNFAIP3 (rs6920220), BANK1, ATG5, PTTG1, PXK, FCGR2A, UBE2L3 and IRAK1-MECP2.
An earlier candidate gene study9 identified MECP2 as a potential risk locus for SLE; however, in the current dataset, SNPs near IRAK1, a gene critical for Toll-like receptor 7 and 9 signaling and located within the identified region of linkage disequilibrium (LD) surrounding MECP2, showed the strongest evidence of association. Similar findings were recently reported11, and further work will be required to determine the causal allele in the IRAK1-MECP2 locus. We found additional evidence of association for three loci (TYK2, ICA1 and NMNAT2) that had previously shown significant but not genome wide–level evidence for association6,10. For four previously implicated variants (LYN, SCUBE1, TLR5 and LY9), no evidence of association was observed in the combined dataset.
To identify previously unknown SLE risk loci, we examined 3,188 SNPs from 2,446 distinct loci that showed evidence of association to SLE in our genome-wide dataset7, which comprised 502,033 SNPs genotyped in 1,310 SLE cases and an expanded set of 7,859 controls. From this dataset, we imputed over 2.1 million variants using Phase II HapMap CEU samples as a reference (see Online Methods) and generated a rank-ordered list of association statistics. Variants with P < 0.05 were selected for possible inclusion on the custom replication array. For efficient genotyping, we identified groups of correlated variants (r2 > 0.2) and then selected at least two SNPs from each group where all SNPs had P < 0.001. For the remaining groups, the SNP with the lowest P value in the group was included. In the replication samples, we calculated the association statistics (see Online Methods) and observed a significant enrichment of the replication results relative to the expected null distribution (Fig. 2). Excluding previously reported SLE risk alleles, there were 134 loci with P < 0.05 (64 expected; P = 2 × 10−15) and 12 loci with P < 0.001 (1 expected; P = 1 × 10−9), suggesting the presence of true positive associations.
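The observed-versus-expected comparison above can be illustrated with an upper-tail binomial test: if n independent loci are tested and each passes the threshold with probability alpha under the null, the expected count is n × alpha and the tail probability serves as an enrichment P value. The sketch below is an assumption-laden illustration, not the authors' procedure (the null model actually used is described in the Online Methods); the function name and the example values of `n_loci`, `alpha` and `observed` are hypothetical.

```python
import math


def binomial_upper_tail(k_observed: int, n: int, p: float) -> float:
    """Return P(X >= k_observed) for X ~ Binomial(n, p), with 0 < p < 1.

    Each term is evaluated in log space so that large binomial
    coefficients do not overflow.
    """
    if k_observed <= 0:
        return 1.0
    log_p = math.log(p)
    log_q = math.log1p(-p)
    total = 0.0
    for k in range(k_observed, n + 1):
        log_coeff = (math.lgamma(n + 1) - math.lgamma(k + 1)
                     - math.lgamma(n - k + 1))
        total += math.exp(log_coeff + k * log_p + (n - k) * log_q)
    return min(total, 1.0)


# Hypothetical inputs, chosen only to show the call pattern.
n_loci = 2000      # assumed number of independent loci tested
alpha = 0.05       # assumed per-locus probability of passing under the null
observed = 150     # assumed number of loci passing the threshold

expected = n_loci * alpha
enrichment_p = binomial_upper_tail(observed, n_loci, alpha)
print(f"expected under null: {expected:.0f}, observed: {observed}")
print(f"upper-tail P value: {enrichment_p:.3g}")
```

A small tail probability indicates more threshold-passing loci than chance alone would produce, which is the sense in which the replication results are described as enriched.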
The replication study identified five new SLE risk loci with a combined P value that exceeded the genome-wide threshold for significance (P < 5 × 10−8): TNIP1, PRDM1, JAZF1, UHRF1BP1 and IL10 (Table 2 and Supplementary Table 2). These loci are discussed in more detail below.
A variant (rs7708392) on 5q33.1 that resides within an intron of TNIP1 (encoding TNF-α-induced protein 3 (TNFAIP3)-interacting protein 1) was significantly associated with SLE in all three cohorts and had a combined P = 3.8 × 10−13 (Fig. 2). Variants near TNIP1 were recently found to contribute to risk of psoriasis12; however, the SLE and psoriasis risk variants are separated by 21 kb and appear to have distinct genetic signals (r2 = 0.001). TNIP1 and TNFAIP3 are interacting proteins13, but the precise role of TNIP1 in regulating TNFAIP3 is unknown. The association of multiple distinct variants near TNFAIP3 with SLE4,14, rheumatoid arthritis15, psoriasis12 and type 1 diabetes16 suggests that this pathway has an important role in regulating autoimmunity.
A second confirmed risk variant (rs6568431, P = 7.12 × 10−10) was identified in an intergenic region between PRDM1 (PR domain containing 1, with ZNF domain, also known as BLIMP1) and ATG5 (APG5 autophagy 5-like). The signal at rs6568431 appears to be distinct from the previously reported6 SLE risk allele within ATG5 (rs2245214, Table 1), as rs6568431 has an r2 < 0.1 with rs2245214, and rs2245214 remains significantly associated with SLE (P < 1 × 10−5) after conditional logistic regression incorporating rs6568431 (Fig. 2).
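The r2 values used above to judge whether two signals are distinct can be approximated from unphased genotypes as the squared Pearson correlation between allele dosages (0, 1 or 2 copies of a chosen allele) at the two SNPs. The following is a minimal sketch under that assumption; the source does not state how r2 was computed, and the function name and dosage vectors are hypothetical.

```python
from statistics import mean
from typing import Sequence


def dosage_r2(snp_a: Sequence[int], snp_b: Sequence[int]) -> float:
    """Squared Pearson correlation between two allele-dosage vectors.

    Each vector holds one value per individual (0, 1 or 2 copies of an
    allele); both vectors must list individuals in the same order.
    """
    if len(snp_a) != len(snp_b) or len(snp_a) < 2:
        raise ValueError("need two equal-length vectors with >= 2 individuals")
    mean_a, mean_b = mean(snp_a), mean(snp_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return (cov * cov) / (var_a * var_b)


# Hypothetical dosages for four individuals at two SNPs.
print(dosage_r2([0, 2, 0, 2], [0, 0, 2, 2]))
```

An r2 near 0 means that genotype at one SNP carries little information about the other, which is why a low value supports treating two associated variants as separate signals.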
The promoter region of JAZF1 (juxtaposed with another zinc finger gene 1) is a third newly confirmed SLE locus (rs849142, P = 1.54 × 10−9). Of note, this same variant was previously associated with risk of type 2 diabetes17 and with height variation18. A separate prostate cancer risk allele near JAZF1 (rs10486567)19 showed no evidence for association in the current study.
A fourth newly identified SLE risk locus is defined by a nonsynonymous allele (R454Q) of UHRF1BP1 (ICBP90 binding protein 1; rs11755393, P = 2.22 × 10−8). This allele encodes a nonconservative amino-acid change in a putative binding partner of UHRF1, a transcription and methylation factor linked to multiple pathways20. The UHRF1BP1 risk allele is in a region of extended LD that encompasses multiple genes, including SNRPC (small nuclear ribonucleoprotein polypeptide C), which is part of an RNA processing complex often targeted by SLE autoantibodies.
The fifth newly identified SLE locus is IL10 (interleukin-10; rs3024505, P = 3.95 × 10−8; Fig. 2). IL10 is an important immunoregulatory cytokine that functions to downregulate immune responses21, and variation in IL10 has inconsistently been reported to be associated with SLE22. The variant associated with SLE is identical to a SNP recently identified as contributing to risk of ulcerative colitis23 and type 1 diabetes24, suggesting the possibility of shared pathophysiology in the IL10 pathway across these disorders.
Using a significance threshold of P < 1 × 10−5 in the combined replication sample, we identified 21 additional SLE candidate risk loci (Table 2 and Supplementary Table 2). Fewer than one locus (0.01 loci, specifically) with P < 1 × 10−5 was expected under a null distribution for the meta-analysis (P = 8 × 10−77), suggesting that several of these loci are likely to be true positives for association to SLE. Notable candidate genes in this list include: (i) IRF8 (interferon regulatory factor 8), which was implicated in a previous GWA study (GWAS)4 and whose family members IRF5 and IRF7 are within confirmed SLE risk loci; (ii) TAOK3 (TAO kinase 3), a kinase expressed in lymphocytes, for which the disease-associated variant is a missense allele (rs428073, N47S); (iii) LYST (lysosomal trafficking regulator), mutations of which cause Chediak-Higashi syndrome in humans, a complex syndrome characterized by a lymphoproliferative disorder; and (iv) IL12RB2 (interleukin 12 receptor, beta 2), a locus which includes IL23R and SERBP1 but appears distinct from the IL23R variants reported in inflammatory bowel disease, psoriasis and ankylosing spondylitis25.
A noteworthy feature of recent GWAS is the large number of overlapping loci found to be shared between different complex diseases26. We tested 42 variants from 35 loci that were previously reported as autoimmune-disease risk alleles for association with SLE (Table 3 and Supplementary Table 3). No single locus had an unadjusted P value < 5 × 10−8; however, we found an enrichment of previously identified disease-associated alleles. From the 35 loci tested (42 total variants), there were 5 alleles with unadjusted P < 0.0004 (fewer than 1 expected by chance, P = 4.4 × 10−12) and with P < 0.05 after a Bonferroni correction for the 35 pre-specified loci. For each of the five variants, the SLE-associated allele matched a previously reported allele and had the same direction of effect (Table 3). We observed a highly significant association to SLE of a missense allele of IFIH1 (rs1990760, P = 3.3 × 10−7) that has previously been associated with type 1 diabetes and Graves’ disease27,28. We also observed an association of SLE with a missense allele (R32Q) of CFB (complement factor B, rs641153) that resides in the HLA class III region and is a validated risk allele for age-related macular degeneration (AMD)29. This missense allele in CFB is not in significant LD with other HLA region variants associated with SLE (DR2/DR3) and remained significant (P < 0.05) after conditional logistic regression analyses that incorporated DR2 and DR3. The HLA is a complex genetic region, but it is noteworthy that the rs641153 SNP has a protective effect nearly identical to that of the reported AMD risk allele29. Further validation of the five candidate disease alleles is required.
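The Bonferroni step above amounts to multiplying each unadjusted P value by the number of pre-specified tests (35 loci) and capping the result at 1. For the stated bound, 0.0004 × 35 = 0.014, which is below 0.05. A minimal sketch follows; the function name is illustrative, and only the count of 35 loci, the 0.0004 bound and the IFIH1 P value are taken from the text.

```python
def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P value: p * n_tests, capped at 1.0."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return min(1.0, p_value * n_tests)


N_LOCI = 35  # number of pre-specified loci, from the text

# Upper bound on the unadjusted P value for the five candidate alleles.
adjusted_bound = bonferroni_adjust(0.0004, N_LOCI)
# IFIH1 (rs1990760) unadjusted P value, from the text.
adjusted_ifih1 = bonferroni_adjust(3.3e-7, N_LOCI)

print(f"adjusted bound: {adjusted_bound:.4f}")
print(f"adjusted IFIH1: {adjusted_ifih1:.3g}")
print("both below 0.05:", adjusted_bound < 0.05 and adjusted_ifih1 < 0.05)
```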
Using 26 SLE risk alleles (21 previously reported loci in Table 1 plus the 5 newly identified SLE loci), we performed several additional analyses. First, we performed pairwise interaction analysis with the previously confirmed loci, and, consistent with previous literature from SLE6 and other complex diseases30, we observed no evidence for non-additive interactions. Using conditional logistic regression analyses, we found no evidence for multiple independent alleles contributing to risk at any of the individual risk loci. We next estimated the percent of variance explained by each of the confirmed SLE risk alleles using previously described methods30. HLA-DRB1 (HLA*DR3), IRF5 and STAT4 were each estimated to account for over 1% of the genetic variance, whereas the remaining loci each accounted for less than 1% of the variance. Together, the 26 SLE risk loci explain an estimated 8% of the total genetic susceptibility to SLE.
Targeted replication of GWAS results is an efficient study design to confirm additional risk loci31. However, there are few available data as to the probability of replicating results that fall short of accepted P value criteria for genome-wide significance. In the current study, all variants with P < 0.05 from the original GWAS were included for replication. The lower a locus’ P value in the GWAS, the higher the probability of that locus reaching candidate or confirmed status in the replication meta-analysis (Fig. 3). Of note, no candidate or confirmed loci were obtained in the current study from the group of variants with a GWAS P value between 0.01 and 0.05, even though this group accounted for ~50% of all variants tested in the replication. These results may be useful in helping guide future targeted study designs, although clearly the size of the original GWAS population, the replication sample size, the disease architecture and the effect size of the candidate disease-associated variants need to be carefully considered in planning replication efforts.
These data provide further evidence that common variation in genes that function in the adaptive and innate arms of the immune system is important in establishing SLE risk. Although each of the identified alleles accounts for only a fraction of the overall genetic risk, these and other ongoing studies are providing new insight into the pathogenesis of lupus and are suggesting new targets and pathways for drug discovery and development.
Methods and any associated references are available in the online version of the paper at http://www.nature.com/naturegenetics/.
We thank the many affected individuals and physicians who contributed DNA samples and clinical data for this study; M.I. Kamboh and P. Davies for the use of Alzheimer’s disease samples as controls in our study; B. Neale for assistance in the percent of genetic variance explained calculation; and S. Sanna and C. Willer for assistance in generating regional association plots. Genotyping of the Swedish samples by the 12K chips was performed using equipment of the SNP technology platform in Uppsala. We thank C. Enström and A.-C. Wiman for assistance with genotyping. Financial support was obtained from the Swedish Research Council for Medicine, the Knut and Alice Wallenberg Foundation, the Swedish Rheumatism Association, the King Gustaf V 80th Birthday Foundation, COMBINE, and a Target Identification in Lupus (TIL) grant from the Alliance for Lupus Research, US. This work was supported in part by R01 AR44804, K24 AR02175, the Mary Kirkland Center for Lupus Research, R01 AR43727 and Institute for Clinical and Translational Research UL1RR025005. These studies were performed in part in the General Clinical Research Center, Moffitt Hospital, University of California, San Francisco, with funds provided by the National Center for Research Resources, 5 M01 RR-00079, US Public Health Service.
Note: Supplementary information is available on the Nature Genetics website.
AUTHOR CONTRIBUTIONS
V.G. and J.K.S. performed the primary statistical analyses and contributed to initial manuscript preparation; J.K.S. managed DNA samples and performed genotyping. G.H. contributed to the statistical analyses and experimental design. K.E.T. and S.A.C. performed statistical analyses and contributed to manuscript preparation. X.S., W.O. and R.C.F. managed DNA samples and contributed to experimental design. G.N., I.G., E.S., L.P., G.S., A.J., A.A.B., S.R.-D., E.C.B., E.E.B., G.S.A., J.C.E., R.R.-G., G.M. Jr., J.D.R., L.M.V., R.P.K., S.M. and M.A.P. provided samples and phenotype information. A.L. managed samples and oversaw genotyping efforts. P.K.G. provided samples and contributed to the initial manuscript preparation. M.F.S. and R.K. contributed statistical analyses and contributed to the selection of the ancestry-informative markers. L.R., L.A.C. and A.-C.S. contributed samples, input into experimental design, data interpretation and initial manuscript preparation; A.-C.S. oversaw genotyping efforts. R.R.G. and T.W.B. contributed to experimental design and interpretation, statistical analyses and initial manuscript preparation. All authors contributed to the final paper.
COMPETING INTERESTS STATEMENT
The authors declare competing financial interests: details accompany the full-text HTML version of the paper at http://www.nature.com/naturegenetics/.
Published online at http://www.nature.com/naturegenetics/.