Meta-analysis was performed with genotype data from four SNPs in a sample set comprising 1453 independent SLE cases, 3381 independent controls and 713 independent trio families. The results demonstrated significant association approaching or exceeding the criterion for genome-wide significance (P < 1 × 10⁻⁸) for all SNPs (). SNP rs5029939, located in the second intron of TNFAIP3 and originally identified in our GWAS, produced a convincing meta-analysis P-value of 1.67 × 10⁻¹⁴ and a combined odds ratio (OR) of 2.09 (95% CI: 1.68–2.60) in the case/control datasets (). In our previous study we reported an OR of 2.28 (95% CI: 1.71–3.06) for marker rs5029939 [2]. Although the meta-analysis OR at rs5029939 decreased to 2.09, the 95% confidence interval around it narrowed, indicating improved precision of the estimate, a primary goal of meta-analysis. In SLE cases of European ancestry, HLA-DR3 and HLA-DR2 alleles are the only risk alleles to consistently demonstrate odds ratios near 2.0 or higher [16]. These results suggest that the genetic effect size marked by rs5029939 in the TNFAIP3 gene is similar to that of the HLA and larger than that of any of the recently identified SLE risk genes, including IRF5. Given the relatively low frequency of the risk allele at rs5029939, this effect aligns with the common disease, rare variant hypothesis of complex genetic disease.
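The narrowing of the confidence interval can be checked directly from the reported figures. The following is an illustrative sketch, not the analysis code used in the study: it recovers the standard error of the log odds ratio implied by each reported 95% CI, and it includes a generic inverse-variance pooling step. The fixed-effect model is an assumption (the text does not state which meta-analysis model was applied), and the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def se_from_ci(lower, upper, z=Z_95):
    """Standard error of log(OR) implied by a confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def pool_fixed_effect(odds_ratios, std_errors):
    """Inverse-variance pooled OR with its 95% CI.

    Assumption: a fixed-effect model; the source does not name its model.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    log_or = sum(w * math.log(o) for w, o in zip(weights, odds_ratios)) / total
    se = math.sqrt(1.0 / total)
    return (math.exp(log_or),
            math.exp(log_or - Z_95 * se),
            math.exp(log_or + Z_95 * se))


# Intervals reported in the text for rs5029939
se_previous = se_from_ci(1.71, 3.06)  # earlier estimate, OR 2.28
se_meta = se_from_ci(1.68, 2.60)      # meta-analysis estimate, OR 2.09
print(round(se_previous, 3), round(se_meta, 3))
```

A smaller standard error on the log scale is what "improved precision" means here, independent of the shift in the point estimate.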
Study Specific and Meta-analysis Association Results for Four SNPs in the Region of TNFAIP3
Genotypes were imputed to determine the contribution of untyped variants to the genetic association in the region of TNFAIP3 and to further define the boundaries of the SLE risk haplotype. Imputation was performed over a 5 Mb (135–140 Mb) interval centered on TNFAIP3, from marker rs4896151 to marker rs1977772 on chromosome 6q, using our previously published GWAS dataset as the source of observed genotypes and the Phase II HapMap as the source of imputed genotypes [20]. In addition to TNFAIP3, this interval contains at least 20 genes, some with a possible role in the immune system, such as interleukin 20 receptor alpha (IL20RA), interleukin 22 receptor alpha (IL22RA), interferon gamma receptor 1 (IFNGR1) and mitogen-activated protein kinase kinase kinase 5 (MAP3K5). Also included were the region upstream of TNFAIP3 associated with risk for RA [6; 7] and the region downstream of TNFAIP3 recently reported to be associated with SLE [8]. Imputation expanded the number of SNPs in the 5 Mb region from 390 observed SNPs to 3,670 total SNPs. Following exclusion of imputed SNPs based on quality control measures (information score < 0.7 and/or ≤ 2 proxy SNPs used to impute any given SNP (NPRX)), 2,497 SNPs remained in the final imputed dataset ().
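The quality control rule described above can be written as a simple filter. This is a minimal sketch under stated assumptions: the record layout and field names (`rsid`, `info`, `nprx`) are hypothetical, and the example records are made up; only the two thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class ImputedSnp:
    rsid: str    # hypothetical identifier field
    info: float  # imputation information score
    nprx: int    # number of proxy SNPs used to impute this SNP


def passes_qc(snp, min_info=0.7, max_excluded_nprx=2):
    """Keep a SNP only if info >= 0.7 and more than 2 proxy SNPs were used."""
    return snp.info >= min_info and snp.nprx > max_excluded_nprx


def filter_imputed(snps):
    """Return the imputed SNPs that survive both exclusion criteria."""
    return [s for s in snps if passes_qc(s)]


# Made-up records for illustration only
example = [
    ImputedSnp("snp_a", 0.95, 5),  # kept
    ImputedSnp("snp_b", 0.65, 5),  # dropped: information score < 0.7
    ImputedSnp("snp_c", 0.90, 2),  # dropped: NPRX <= 2
]
kept = filter_imputed(example)
print([s.rsid for s in kept])
```

Because the exclusion criteria are joined by "and/or", a SNP must pass both checks to be retained.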
Figure 1 Results of Imputation Across a 5 Mb Region Centered on TNFAIP3. A. Results showing the full 5 Mb imputation interval. Imputed SNPs are in red and observed SNPs in blue. Locations of genes flanking TNFAIP3 are indicated at the top. B. Expanded view of region ...
The strongest association signals were detected in the vicinity of TNFAIP3 () with both observed and imputed SNPs. No other region in the 5 Mb interval reached significance at P < 1.0 × 10⁻⁴, including variants in the region 185 kb upstream associated with RA and SLE or the region near PERP (Supplemental Table 1). In contrast, eleven imputed SNPs spanning the TNFAIP3 locus demonstrated association with SLE at P < 1.0 × 10⁻⁴ (). Imputation accuracy for all eleven SNPs was greater than 99%. The concordance rates between observed and imputed genotypes for the three observed SNPs (rs10499197, rs5029939, rs7749323) exceeded 99%, indicating robust imputation over this region. SNP rs5029939 was the most statistically significant variant among all observed and imputed SNPs (). The exon 3 missense SNP, rs2230926, was not imputed because it was not present in either the GWAS or HapMap datasets; however, rs5029939 is in strong LD with it (r² = 0.99) and may indeed incorporate the effect at rs2230926 [2].
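For readers less familiar with the LD statistic quoted above, r² between two biallelic loci follows from the haplotype and allele frequencies. The sketch below is illustrative only; the frequencies are hypothetical and are not estimates for rs5029939 and rs2230926.

```python
def r_squared(p_ab, p_a, p_b):
    """LD r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical frequencies: two low-frequency alleles that almost always co-occur
print(round(r_squared(p_ab=0.0298, p_a=0.03, p_b=0.03), 3))
```

An r² close to 1 means that the genotype at one SNP predicts the genotype at the other almost perfectly, which is why an association signal at a tag SNP can capture the effect of an untyped variant.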
Eleven imputed SNPs and three observed SNPs are associated with SLE in the region of TNFAIP3
Imputation also defined the length of the associated TNFAIP3 risk haplotype. Before imputation, association with SNPs on the 3′ end extended as far as rs7749323; following imputation, additional SNPs extended the risk haplotype approximately 12 kb downstream to marker rs6932056, resulting in a risk segment approximately 109 kb in length.
The distribution of allele frequencies and odds ratios for the imputed SNPs was consistent with the presence of more than one haplotype. Therefore, we evaluated the haplotypic and LD relationships for the observed and imputed SNPs listed in . Three haplotypes with frequency greater than 1% were identified (). Conditional logistic regression analysis implemented in PLINK [21] was used to determine whether the haplotypes contributed independent genetic risk for SLE, using haplotype 1 as the reference haplotype. The omnibus likelihood ratio test (LRT) yielded a P-value of 4.0 × 10⁻⁴, consistent with variants in the region of TNFAIP3 influencing risk for SLE. The analysis demonstrated an independent effect for haplotype 3 (LRT P = 1.0 × 10⁻⁴) but not haplotype 2 (LRT P = 0.55) (). Conditioning on haplotype 3 in comparison to the reference haplotype resulted in no evidence of association (LRT P = 0.42). In contrast, significant evidence of association was seen when the reference haplotype was conditioned on haplotype 2 (LRT P) (). These results support the conclusion that genetic variation carried on haplotype 3 is responsible for the association with SLE. As was seen in the meta-analysis, SNP alleles carried exclusively on haplotype 3 produced OR ≥ 2.0.
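The likelihood ratio tests reported above compare the fit of a fuller model with that of a reduced (constrained) model. As a minimal sketch, assuming the usual chi-square reference distribution, the P-value can be obtained in closed form when the models differ by one or two parameters. The log-likelihood values below are hypothetical, and the degrees of freedom for each test in the study are not stated in the text.

```python
import math


def lrt_p_value(loglik_reduced, loglik_full, df):
    """P-value of a likelihood ratio test for df = 1 or df = 2.

    The statistic 2 * (loglik_full - loglik_reduced) is referred to a
    chi-square distribution; closed forms exist for these two cases.
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("the full model cannot fit worse than the reduced model")
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 is handled in this sketch")


# Hypothetical log-likelihoods, not values from the study
print(lrt_p_value(loglik_reduced=-1000.0, loglik_full=-996.0, df=1))
```

A large P-value after conditioning on a haplotype indicates that the remaining haplotypes add little to the fit, which is the logic behind attributing the signal to haplotype 3.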
Figure 2 Conditional haplotype analyses for the imputed TNFAIP3 risk haplotype. Three haplotypes are shown with frequencies > 1%. Imputed SNPs are in black font and observed SNPs are in blue font. LD relationships (r²) are shown in the figure below the ...
Clinical data were available for 1351 female SLE cases of European descent and were used to define SLE sub-phenotypes based on the revised ACR criteria (malar rash, discoid rash, photosensitivity, oral ulcers, arthritis, serositis, nephritis, neurologic disorder, hematologic disorder, antinuclear antibody, and immunologic disorder) [22; 23] and the presence of anti-Ro/SSA and anti-La/SSB autoantibodies (). Case subsets were compared to a group of 1172 female controls without a personal or family history of autoimmune disease. For comparison of anti-Ro/SSA and anti-La/SSB antibodies, the control group consisted of 348 subjects who were negative for these autoantibodies by serologic testing. Association analysis was performed by comparing the frequencies of the risk (C) and non-risk (G) genotypes at rs5029939, which tags the SLE risk haplotype. There were 144 SLE cases (Freq. = 0.057) and 71 controls (Freq. = 0.031) that carried the CG genotype. Frequencies for the CC genotype in cases and controls were low (CC cases = 0.006, CC controls = 0), precluding analysis of the CC genotype.
Association of alleles at rs5029939 with SLE sub-phenotypes compared to healthy controls.
Of the sub-phenotypes evaluated, nephritis and hematologic disorder demonstrated lower P-values and higher attributable risk and odds ratios compared to the SLE phenotype, even though only 28% and 56% of the cases, respectively, were used in the analyses of these phenotypes (). Note that the analysis of the SLE phenotype without the nephritis cases or the hematologic cases resulted in an increase in the P-value from 3.75 × 10⁻⁵ to 0.0012 and 0.01, respectively (). Excluding both the nephritis and hematologic sub-phenotypes from the SLE phenotype resulted in a non-significant association with the CG genotype of rs5029939 (P-value = 0.053). Taken together, these results suggest that SLE patients with the CG genotype at rs5029939 are more than twice as likely to develop lupus nephritis and/or hematologic manifestations as SLE patients with the GG genotype.
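The odds ratios compared in this analysis come from 2×2 tables of genotype (CG versus GG) by status (case subset versus control). A minimal sketch of that calculation is given here, using the Woolf log-scale confidence interval; the choice of interval method is an assumption, and the counts are hypothetical rather than taken from the study.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.959964):
    """Odds ratio and Woolf (log-scale) 95% CI from a 2x2 table."""
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    se = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                   + 1 / control_carriers + 1 / control_noncarriers)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts (CG carriers vs GG non-carriers), for illustration only
print(odds_ratio_ci(30, 270, 15, 285))
```

Smaller case subsets give wider intervals for the same odds ratio, so a sub-phenotype reaching a lower P-value with fewer cases implies a larger underlying effect.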
Conditional analysis of clinical traits
We then performed an analysis of SLE cases only, stratified by SLE sub-phenotypes. This analysis failed to produce statistically significant associations for any of the sub-phenotypes (Supplementary Table 2). This is likely due to the reduced statistical power that results from the smaller sample sizes when using only SLE cases. Considering the nephritis sub-phenotype, for example, power analyses suggest that we would need approximately 560 cases (SLE with nephritis) to detect an effect size similar to the case-control results, whereas our data included 379 lupus nephritis subjects (Supplementary Table 2).
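To illustrate why a case-only comparison needs more subjects, a standard normal-approximation formula for comparing two proportions can be sketched as follows. This is not the power calculation used in the study (the text does not describe its method or inputs), the frequencies are hypothetical, and the sketch is not expected to reproduce the figure of 560 cases.

```python
import math
from statistics import NormalDist


def cases_needed(p_group1, p_group2, alpha=0.05, power=0.80):
    """Per-group sample size to detect a difference between two proportions.

    Normal-approximation formula assuming equal group sizes and a
    two-sided test at level alpha.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p_group1 + p_group2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_power * math.sqrt(p_group1 * (1 - p_group1)
                                       + p_group2 * (1 - p_group2))) ** 2
    return math.ceil(numerator / (p_group1 - p_group2) ** 2)


# Hypothetical carrier frequencies, for illustration only
print(cases_needed(p_group1=0.12, p_group2=0.06))
```

The required sample size grows rapidly as the difference in genotype frequency between the two groups shrinks, which is the situation when cases with a sub-phenotype are compared to other cases rather than to healthy controls.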
Next we evaluated whether clusters of clinical sub-phenotypes were associated with the risk allele at rs5029939. To define the clusters we used a principal components approach, which produced five clusters from 10 of the 11 ACR criteria evaluated, the first three of which explained 56.5% of the total variation. Anti-nuclear antibodies were present in 98% of the case subjects, which precluded clustering of this sub-phenotype. Overall, the sub-phenotypes within each of the five clusters (Cluster 1 - malar rash, photosensitivity, and oral ulcers; Cluster 2 - renal, immunologic and hematologic manifestations; Cluster 3 - arthritis and serositis; Cluster 4 - neurologic disorder; Cluster 5 - discoid rash) were moderately correlated (0.33–0.57), but no correlation was observed with variables outside their respective clusters (Supplementary Table 3). A component score was then estimated for each cluster and each case subject using the principal components derived from the clustering procedure, generating five new covariates for each case subject. Logistic regression analysis was performed using rs5029939 (omitting homozygous risk individuals) as the dependent variable and SLE and the cluster component score as independent variables (). Only the results for the first three clusters are reported, as clusters 4 and 5 explained only 2.2% of the total variation and sub-phenotypes within these clusters were not associated with rs5029939. In line with our analysis using individual sub-phenotypes, cluster 2 (renal, hematologic and immunologic manifestations) demonstrated a better fit to the model when compared to cluster 1, cluster 3, or SLE (). Importantly, when SLE was adjusted for cluster 2, association with rs5029939 was not significant (Wald P = 0.81), yet association with SLE remained at or near nominal significance when adjusting for cluster 1 (P = 0.04) or cluster 3 (P = 0.06). These results suggest that the association between rs5029939 and subjects with renal, hematologic and immunologic manifestations is not due to confounding with SLE but rather represents a sub-phenotype specific genetic effect.
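The Wald P-values quoted above test whether a single logistic regression coefficient differs from zero. A minimal sketch of that test follows, assuming the usual normal reference distribution; the coefficient and standard error are hypothetical and are not estimates from the study.

```python
import math


def wald_p_value(beta, std_error):
    """Two-sided Wald P-value for a regression coefficient."""
    z = beta / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))


def odds_ratio(beta):
    """Odds ratio corresponding to a logistic regression coefficient."""
    return math.exp(beta)


# Hypothetical coefficient and standard error, for illustration only
print(wald_p_value(beta=0.05, std_error=0.21), odds_ratio(0.05))
```

When the SLE coefficient shrinks toward zero after a cluster score is added to the model, its Wald P-value rises, which is how the adjusted analysis distinguishes a sub-phenotype specific effect from a general SLE effect.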
Logistic regression of rs5029939 with SLE and cluster component scores.