The major histocompatibility complex (MHC) on chromosome 6p21 is a key contributor to the genetic basis of systemic lupus erythematosus (SLE). Although SLE affects African Americans disproportionately compared to European Americans, there has been no comprehensive analysis of the MHC region in relationship to SLE in African Americans. We conducted a screening of the MHC region for 1,536 single nucleotide polymorphisms (SNPs) and the deletion of the C4A gene in a SLE case-control study (380 cases, 765 age-matched controls) nested within the prospective Black Women’s Health Study. We also genotyped 1,509 ancestral informative markers throughout the genome to estimate European ancestry in order to control for population stratification due to population admixture. The SNP most strongly associated with SLE was rs9271366 (odds ratio, OR = 1.70, p = 5.6×10⁻⁵) near the HLA-DRB1 gene. Conditional haplotype analysis revealed three other SNPs, rs204890 (OR = 1.86, p = 1.2×10⁻⁴), rs2071349 (OR = 1.53, p = 1.0×10⁻³), and rs2844580 (OR = 1.43, p = 1.3×10⁻³), to be associated with SLE independent of the rs9271366 SNP. In univariate analysis, the OR for the C4A deletion was 1.38, p = 0.075, but after simultaneous adjustment for the other four SNPs the odds ratio was 1.01, p = 0.98. A genotype score combining the four newly identified SNPs showed an additive risk according to the number of high-risk alleles (OR = 1.67 per high-risk allele, p < 0.0001). Our strongest signal, the rs9271366 SNP, was also associated with higher risk of SLE in a previous Chinese genome-wide association study (GWAS). In addition, two SNPs found in a GWAS of European ancestry women were confirmed in our study, indicating that African Americans share some genetic risk factors for SLE with European and Chinese subjects. In summary, we found four independent signals in the MHC region associated with risk of SLE in African American women.
Systemic lupus erythematosus (SLE) is a debilitating chronic autoimmune disease characterized by autoantibody production and the involvement of multiple organs. The existence of a heritable basis for SLE has long been recognized; the risk of SLE to siblings of an SLE proband (sibling risk ratio, λS) has been estimated to be as high as 29-fold the risk of the background population (Alarcon-Segovia et al. 2005), and monozygotic twins show a higher concordance rate (24%) than dizygotic twins (2%) (Deapen et al. 1992).
To date, the strongest associations of genetic factors with SLE are with variants located in the major histocompatibility complex (MHC) on the short arm of chromosome 6 (6p21.3), which contains hundreds of genes, many of them with immune-related functions (The MHC sequencing consortium 1999; Campbell and Trowsdale 1993; Horton et al. 2004). Before the advent of genome-wide association studies (GWAS), serologic types of the human leukocyte antigen (HLA) genes were found to be associated with risk of SLE. In particular, the HLA class II HLA-DRB1*1501 (serologic type HLA-DR2) and HLA-DRB1*0301 (serologic type HLA-DR3) alleles have been consistently found to be associated with risk of SLE in populations of European ancestry, and these alleles confer up to a three-fold increase in the risk of SLE (Criswell 2008; Harley et al. 1998; Tan and Arnett 1998).
Variations of complement genes within the MHC region have also been associated with risk of SLE. Numerous studies have reported that the null allele of the C4 gene (C4A*Q0) confers higher risk of SLE (Moser et al. 2009; Walport 1993). This null allele is the strongest genetic risk factor for SLE shared by African Americans, European Americans, and northern Europeans. The association with SLE in African Americans is most specific for a deletion of the C4A gene that extends from the 5′ region of the C4A gene through the entire flanking CYP21A2 gene and terminates in the 5′ region of the C4B gene.
Recent GWAS (Graham et al. 2008; Han et al. 2009; Harley et al. 2008; Hom et al. 2008) and high-density screenings (Barcellos et al. 2009; Rioux et al. 2009b) have identified several independent single nucleotide polymorphisms (SNPs) in the MHC region associated with risk of SLE. However, those studies have been performed in European or East Asian ancestry populations. The relationship between genetic variants in the MHC region and risk of SLE in African American women, a population at high risk for SLE compared with European Americans (Helmick et al. 2008), is still unknown.
We conducted a screening of the MHC region in a nested case-control study in African American women to identify genetic variants associated with risk of SLE. In addition, we typed the deletion of the C4A gene to assess its combined effect with MHC genetic variants.
We conducted a nested case-control study within the ongoing prospective Black Women’s Health Study (BWHS) that has been described elsewhere (Palmer et al. 2003). Briefly, the study began in 1995 when women 21–69 years of age from across the United States completed a 14-page postal health questionnaire. The initial cohort comprises 59,000 women who self-identified as “black” and had a valid address. Follow-up questionnaires are sent every two years. Follow-up of the baseline cohort through six completed 2-year cycles to date is greater than 80%.
We obtained DNA samples from BWHS participants using the mouthwash-swish method (Cozier et al. 2004). Approximately 50% of participants, 27,200 women, provided a sample. Women who provided samples were slightly older than women who did not, but the two groups were similar with regard to educational level, geographic region of residence, smoking status, body mass index, and a wide variety of other factors.
We used medical records and information from the physician to confirm self-reported cases of SLE (McAlindon et al. 2003). Physicians were asked to complete a checklist of the American College of Rheumatology (ACR) criteria for SLE. We have defined “definite” SLE as cases with 4 or more ACR criteria; “probable” cases as having 3 criteria; and “clinical” cases as those confirmed by the physician as having SLE without listing criteria or for which SLE medications were being used and the woman did not report discoid lupus or rheumatoid arthritis. The present study includes 400 validated SLE cases (220 definite, 49 probable, 131 clinical). We matched each case with 2 controls randomly selected from among BWHS participants who did not report SLE and who had provided saliva samples; matching factors were birth year (+/− 1 year) and region of residence in the U.S. (Northeast, South, Midwest, West). We matched a total of 400 cases with 800 controls.
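The matching step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's actual procedure: the record fields (`id`, `birth_year`, `region`), the function name, and the choice to sample controls without reuse across cases are all assumptions made for the example.

```python
import random

def match_controls(cases, pool, n_controls=2, seed=0):
    """Randomly pick n_controls per case from eligible non-cases.

    Eligibility follows the matching factors in the text: same U.S. region
    and birth year within +/- 1 year. Sampling without reuse of a control
    is an assumption of this sketch.
    """
    rng = random.Random(seed)
    used = set()          # ids of controls already assigned to a case
    matches = {}
    for case in cases:
        eligible = [
            c for c in pool
            if c["id"] not in used
            and c["region"] == case["region"]
            and abs(c["birth_year"] - case["birth_year"]) <= 1
        ]
        chosen = rng.sample(eligible, min(n_controls, len(eligible)))
        used.update(c["id"] for c in chosen)
        matches[case["id"]] = [c["id"] for c in chosen]
    return matches
```

A case with fewer than two eligible controls simply receives fewer matches here; how the study handled such cases is not stated in the text.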
We used the Illumina GoldenGate MHC panel developed by the International MHC and Autoimmunity Genetics Network (IMAGEN) consortium (de Bakker et al. 2006; Rioux et al. 2009a). This panel consists of 1,536 SNPs that tag both classic HLA types and common genetic variation through the 3.44 Mb of the classic MHC region (de Bakker et al. 2006; Rioux et al. 2009a). Because this MHC panel was designed to tag common genetic variation in European ancestry populations, for the purpose of assessing African Americans 397 SNPs were replaced by SNPs that better tag genetic variation in the HapMap YRI population. We also assessed the 30 kb deletion of the C4A gene.
We used the phase 3 admixture panel to estimate and control for population stratification due to European admixture. The phase 3 panel consists of 1,509 ancestral informative markers (AIMs) that have high allele frequency differences between African and European continental populations. The admixture panel is based on original sets described by Smith et al. (2004) and Reich et al. (2005), and was further improved by mining AIMs from Hinds et al. (2005) and from the phase 2 International HapMap project (Frazer et al. 2007).
DNA was isolated from mouthwash swish samples provided by SLE cases and controls at the Boston University Molecular Core Genetics Laboratory using the QIAAMP DNA Mini Kit (Qiagen). Whole genome amplification was performed with the Qiagen RePLI-g Kits using the method of multiple displacement amplification. Amplified samples underwent purification and PicoGreen quantification at the Broad Institute Center for Genotyping and Analysis (Cambridge, MA) before being plated for genotyping.
Genotyping of the MHC panel and the phase 3 admixture panel was carried out at the Broad Institute Center for Genotyping and Analysis using Illumina GoldenGate technology. We included 98 blinded duplicate samples to assess reproducibility of the genotypes; average reproducibility among the blinded duplicates was 99%. We excluded all SNPs with a call rate < 90%, a deviation from Hardy-Weinberg equilibrium in the control samples at p < 0.001, or a minor allele frequency < 0.01. We also excluded samples with call rates < 80%. Genotyping of the deletion of the C4A gene was performed in the laboratory of Dr. Fraser using long PCR with a specific primer design to detect the 30 kb deletion of most of the C4A gene (Grant et al. 2000). As positive controls, we used restriction fragment length polymorphism methodology (Fraser et al. 2000) to genotype a panel of DNA samples known to possess the C4A gene deletion. Using these quality control criteria, we successfully genotyped 1,286 SNPs in the MHC region, 1,174 AIMs, and the C4A deletion in 380 SLE cases and 765 controls. The mean call rate in the final data set for both SNPs and samples was 99.5%.
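The SNP-level exclusion criteria above (call rate, Hardy-Weinberg equilibrium in controls, minor allele frequency) can be illustrated with a short Python sketch. The function names and the 0/1/2/`None` genotype coding are hypothetical, and the Hardy-Weinberg check uses a 1-df chi-square goodness-of-fit test as an assumption; the text does not say which Hardy-Weinberg test was applied.

```python
import math

def call_rate(genos):
    """Fraction of non-missing genotype calls (missing coded as None)."""
    return sum(g is not None for g in genos) / len(genos)

def minor_allele_freq(genos):
    """Minor allele frequency from allele counts coded 0, 1, 2."""
    called = [g for g in genos if g is not None]
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)

def hwe_chi2_pvalue(genos):
    """1-df chi-square test of Hardy-Weinberg proportions (assumed test)."""
    called = [g for g in genos if g is not None]
    n = len(called)
    if n == 0:
        return 1.0
    obs = [called.count(k) for k in (0, 1, 2)]
    p = (obs[1] + 2 * obs[2]) / (2 * n)
    q = 1 - p
    exp = [n * q * q, 2 * n * p * q, n * p * p]
    if min(exp) == 0:          # monomorphic SNP: nothing to test
        return 1.0
    chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    return math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square, 1 df

def snp_passes_qc(all_genos, control_genos):
    """Apply the thresholds from the text: call rate >= 90%, MAF >= 0.01,
    and HWE p >= 0.001 in controls."""
    return (call_rate(all_genos) >= 0.90
            and minor_allele_freq(all_genos) >= 0.01
            and hwe_chi2_pvalue(control_genos) >= 0.001)
```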
We used PLINK (Purcell et al. 2007) version 1.06 to calculate summary statistics for the genotype data. We tested for association between MHC SNPs and SLE using the 1-df Cochran-Armitage trend test of an additive genetic model with 100,000 permutations to estimate empirical p-values and to correct for multiple testing; an MHC-wide significant result was declared at p-permutated < 0.05.
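As an illustration of the test described above, the following Python sketch computes the 1-df Cochran-Armitage trend statistic for an additive model and a permutation-based p-value corrected across SNPs. It is not PLINK's implementation: the function names are hypothetical, missing genotypes are not handled, and the max-statistic correction is assumed here as one common way to obtain multiple-testing-corrected empirical p-values. The trend statistic is computed through the identity chi-square = N·r², where r is the Pearson correlation between allele count and case status.

```python
import random

def trend_chi2(genos, status):
    """Cochran-Armitage trend statistic (1 df) for genotypes coded 0/1/2
    and case status coded 0/1, computed as N * r^2."""
    n = len(genos)
    mg = sum(genos) / n
    ms = sum(status) / n
    sgg = sum((g - mg) ** 2 for g in genos)
    sss = sum((s - ms) ** 2 for s in status)
    if sgg == 0 or sss == 0:       # no variation: statistic undefined, use 0
        return 0.0
    sgs = sum((g - mg) * (s - ms) for g, s in zip(genos, status))
    return n * sgs * sgs / (sgg * sss)

def max_t_permutation(snps, status, n_perm=1000, seed=0):
    """Corrected empirical p-values: for each permutation of case labels,
    compare the maximum statistic over all SNPs with each observed one."""
    rng = random.Random(seed)
    observed = {name: trend_chi2(g, status) for name, g in snps.items()}
    exceed = {name: 0 for name in snps}
    labels = list(status)
    for _ in range(n_perm):
        rng.shuffle(labels)
        max_stat = max(trend_chi2(g, labels) for g in snps.values())
        for name, obs in observed.items():
            if max_stat >= obs:
                exceed[name] += 1
    # add-one correction keeps p-values strictly positive
    return {name: (exceed[name] + 1) / (n_perm + 1) for name in snps}
```

The default of 1,000 permutations is only to keep the example fast; the study used 100,000.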
We used the conditional haplotype method (Valdes et al. 1997; Valdes and Thomson 1997) to detect secondary signals that are independent of the MHC-wide significant SNPs. The conditional haplotype method stratifies by haplotypic background and tests the null hypothesis that one or more SNPs have no independent haplotypic effect once we condition for such background. We first selected SNPs with p<0.01 based on univariate analysis; we then applied the conditional haplotype method conditioning on MHC-wide significant SNPs to test for independent effects of secondary SNPs. A significant secondary signal was declared if the p-value from the conditional haplotype method was <0.001 in agreement with previous use of the method in the MHC region (Barcellos et al. 2009).
We used PROC LOGISTIC of the SAS statistical software version 9.1.3 (SAS Institute Inc., Cary, NC, USA) to estimate odds ratios (ORs) and 95% confidence intervals (95% CI) for MHC-wide significant SNPs as well as for secondary SNPs identified through the conditional haplotype method and the C4A deletion. First, we ran individual models for each SNP and the C4A deletion. Then, we ran a combined model that contained all the identified SNPs plus the C4A deletion to test whether they have independent effects on risk of SLE after simultaneously adjusting for each other genetic variant. We adjusted the ORs for age, geographical region of residence (Northeast, South, Midwest, West), place of birth (US, foreign country), and European admixture proportion. We estimated individual European admixture proportions using a Bayesian approach as implemented in the Admixmap software (Hoggart et al. 2003; McKeigue et al. 2000). We used previously published data (Freedman et al. 2006; Reich et al. 2005; Smith et al. 2004) and data from the International HapMap Project (Frazer et al. 2007) to estimate allele frequencies of each AIM in the parental populations.
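To make the link between a logistic model and the reported odds ratios concrete, here is a minimal Python sketch that fits a logistic regression with a single predictor (for example, an allele count coded 0/1/2) by Newton's method and converts the coefficient into an OR with a Wald 95% CI. It is a simplification under stated assumptions: the study used SAS PROC LOGISTIC with additional covariates (age, region, place of birth, European admixture), none of which are modeled here, and the function names are hypothetical.

```python
import math

def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton's method.
    Returns the slope b1 and its standard error."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p               # score for the intercept
            g1 += (yi - p) * xi        # score for the slope
            h00 += w                   # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)       # from the inverse information matrix
    return b1, se_b1

def odds_ratio_ci(x, y, z=1.96):
    """Odds ratio per unit of x with a Wald confidence interval."""
    b1, se = fit_logistic(x, y)
    return math.exp(b1), math.exp(b1 - z * se), math.exp(b1 + z * se)
```

With a binary predictor the fitted OR reduces to the familiar cross-product ratio of the 2×2 table, which is a convenient sanity check.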
We calculated a multi-loci genotypic score to estimate the joint effects of the identified polymorphisms that have independent effects. Each individual SNP was coded as 0, 1, and 2 based on the number of SLE alleles that increased risk; the genotype score consisted of the sum of SLE risk alleles of the identified SNPs. We used logistic regression to estimate ORs and 95% CI of the different categories of the genotype score, using as reference the category with the lowest number of SLE risk alleles.
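The score described above is simply a sum of risk-allele counts. A minimal Python sketch follows, assuming each subject's genotypes are already coded as the number of risk alleles (0, 1, or 2) per SNP. The function names are hypothetical; the reference (≤3) and highest (≥6) cut points are those reported in the results, and everything in between is grouped as "intermediate" purely for illustration.

```python
# The four SNPs identified in the study
SCORE_SNPS = ("rs9271366", "rs204890", "rs2071349", "rs2844580")

def genotype_score(risk_allele_counts):
    """Sum of risk alleles over the four SNPs; ranges from 0 to 8."""
    for snp in SCORE_SNPS:
        if risk_allele_counts[snp] not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1, or 2")
    return sum(risk_allele_counts[snp] for snp in SCORE_SNPS)

def score_category(score):
    """Group a score using the reference and highest categories from the
    text; the 'intermediate' label is an assumption of this sketch."""
    if score <= 3:
        return "reference (<=3)"
    if score >= 6:
        return "highest (>=6)"
    return "intermediate"
```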
Table 1 presents general characteristics of SLE cases and controls. SLE cases had a slightly lower percentage of European ancestry as compared to controls (19.7% vs. 20.9%, p = 0.10).
Figure 1 displays results of the single marker associations with SLE in the MHC region. We found SNP rs9271366 in the intergenic region between the HLA-DRB1 and HLA-DQA1 genes to be associated with SLE at an MHC-wide level (p for trend = 5.6×10⁻⁵, p-permutated = 0.05). Conditional haplotype analysis revealed three secondary signals that were independent of the SNP rs9271366: the SNP rs2844580 (p-conditional = 0.0009) about 9 kb upstream of the HLA-B gene, the SNP rs204890 (p-conditional = 0.0007) inside the ATF6B gene, and the SNP rs2071349 (p-conditional = 0.0004) inside the HLA-DPB1 gene. The deletion of the C4A gene was more frequent in SLE cases than controls (6.5% vs. 4.8%, p = 0.075). We found that the four newly identified SNPs remained significantly associated with risk of SLE even after simultaneous adjustment in a combined model that contained the four SNPs plus the C4A deletion. The deletion of the C4A gene was not associated with risk of SLE after simultaneous adjustment for the other four SNPs (OR = 1.01). A summary of the results and odds ratios is shown in Table 2.
Linkage disequilibrium values (D′ and r2) among the four SNPs and the deletion of the C4A gene, shown in Table 3, indicate that these variants are not in linkage disequilibrium with each other and the four newly identified SNPs most likely represent independent signals in the MHC region.
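For readers unfamiliar with the two linkage disequilibrium measures, the sketch below computes D′ and r² for a pair of biallelic loci from haplotype counts. It assumes phased haplotype counts are available, which is a simplification: with unphased genotype data such as the study's, haplotype frequencies would first have to be estimated. The function and argument names are hypothetical.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D', r^2) for two biallelic loci from haplotype counts.

    n_AB is the number of haplotypes carrying allele A at locus 1 and
    allele B at locus 2, and so on.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_AB - p_A * p_B                       # raw disequilibrium
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max                   # normalized to [0, 1]
    r2 = d * d / denom                         # squared allelic correlation
    return d_prime, r2
```

Values of both measures near zero, as reported for the variants in Table 3, are what one expects when alleles at the two loci are inherited approximately independently.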
We explored the joint effect of the four newly identified SNPs on the risk of SLE by means of a genotype score based on the number of risk-increasing alleles (Table 4). The C4A deletion was not included in the genotype score because the combined analysis showed no independent effect of the C4A deletion after adjustment for the other four SNPs. SLE cases had a higher average genotype score compared to controls (4.6 vs. 4.2 risk alleles respectively, p for differences of means < 0.0001). We found that risk of SLE increased in an additive way according to the number of risk alleles; for each additional risk allele, risk of SLE increased by 67%; OR (95% CI) = 1.67 (1.46–1.92). Subjects in the highest category of genotype score (≥6 risk alleles) had almost five times the risk of SLE as compared to subjects in the reference category (≤3 risk alleles), OR (95% CI) = 4.65 (2.82–7.69). No major differences in results were observed when we restricted our analysis to SLE cases diagnosed during follow-up or within the five-year period before baseline (N = 207), cases meeting at least 3 ACR criteria (N = 250), cases meeting at least 4 ACR criteria (N = 202), or to clinical subgroups with different manifestations of SLE (Table 4).
We examined whether previously reported MHC SNPs associated with SLE are also associated with SLE in the BWHS. Reported MHC SNPs were taken from a GWAS by the International Consortium for Systemic Lupus Erythematosus Genetics (SLEGEN) in women of European ancestry (Harley et al. 2008), a GWAS in a Chinese Han population (Han et al. 2009), and a high-density screening of the MHC region in European ancestry subjects (Barcellos et al. 2009) (Table 5). If the reported SNP was not part of our MHC panel, we determined the best proxy in the HapMap YRI population using the SNAP software (Johnson et al. 2008). The SLEGEN GWAS reported the rs3131379 and rs1270942 SNPs (Harley et al. 2008), and the Chinese GWAS found 13 SNPs in the MHC region that represented two independent signals in the MHC region at rs9271100 and at rs3997854 (Han et al. 2009). It is noteworthy that in the Chinese GWAS one of the 13 SNPs found in the MHC region was the rs9271366 SNP that was the most significant finding in the present study. Both rs9271100 (the most significant MHC SNP in the Chinese GWAS) and rs9271366 (the most significant MHC SNP in the present study) are just 10 kb apart from each other and they are in the same haplotype block in both HapMap CHB and YRI populations. The high-density MHC screening in European ancestry subjects reported 10 independent signals along the MHC region (Barcellos et al. 2009); we were able to find proxies for eight of the reported SNPs. We found that two SNPs from the SLEGEN GWAS, rs3131379 and rs1270942 (tested by the perfect proxy rs440454), and two SNPs from the Chinese GWAS, rs9271100 (tested by the rs9271366, r2 = 0.73) and rs3997854, were associated with increased risk of SLE in the BWHS. One of the SNPs from the Chinese GWAS, the C-allele of the rs3997854 SNP, which was significantly positively associated with SLE in the BWHS, was inversely associated in the Chinese sample.
None of the SNPs or their proxies from the MHC high-density screening in European ancestry subjects were associated with risk of SLE in the BWHS.
We also assessed whether the HLA-DRB1*0301 (HLA-DR3) and HLA-DRB1*1501 (HLA-DR2) alleles, which have been associated with higher risk of SLE in European ancestry populations (Hartung et al. 1992; Yao et al. 1993), are also associated with risk of SLE in the BWHS. Although we did not determine HLA serotypes, our MHC panel included tagSNPs that capture variation of the classical HLA types (de Bakker et al. 2006). The haplotype rs3129763_A/rs2647012_T/rs3128968_T tagged the HLA-DRB1*0301 allele (r2=0.71, D′=0.89 in HapMap YRI population), and the rs2116264_G allele tagged the HLA-DRB1*1501 allele (r2=1.0, D′=1.0 in HapMap YRI population). We did not find significant associations of the HLA-DRB1*0301 allele (6.4% in SLE cases vs. 6.4% in controls, p for difference = 0.99) or the HLA-DRB1*1501 allele (1.7% in SLE cases vs. 2.8% in controls, p for difference = 0.11) with SLE in the BWHS.
In the present study we identified four independent SNPs (rs9271366, rs204890, rs2071349, and rs2844580) in the MHC region associated with risk of SLE in a population of African American women. The four identified SNPs extend over a region of 1.7 Mb and they point to different signals that may contribute to SLE risk in African American women.
SNPs rs9271366 (the most significant signal in the present study) and rs2071349 are both located inside the HLA class II region that has been consistently associated with SLE in European, East Asian and African American populations (Fernando et al. 2007; Han et al. 2009; Hom et al. 2008; Reveille et al. 1998; Rioux et al. 2009b; Uribe et al. 2004). It is noteworthy that a recent GWAS found the rs9271366 SNP to be associated with SLE in Chinese Han subjects at genome-wide significance (Han et al. 2009). The rs9271366 was part of a group of genome-wide significant SNPs near the HLA-DRB1 gene that belong to the same haplotype block and seem to be pointing to the same causal signal (Han et al. 2009).
In populations of European ancestry, a major part of the signal from the HLA class II region comes from the presence of haplotypes carrying the HLA-DRB1*0301 and HLA-DRB1*1501 alleles (Graham et al. 2007; Graham et al. 2002). In the present study, both HLA class II alleles (as determined by tagging SNPs) were in low frequency and not associated with SLE, suggesting that the observed associations in the HLA class II region in African American women are not explained by classic class II alleles previously identified as risk factors in populations of European ancestry.
The rs204890 SNP is located in intron 6 of the ATF6B gene that codes for a transcription factor involved in the unfolded protein response pathway during endoplasmic reticulum stress. It is noteworthy that the rs8283 SNP identified in a previous high-density screening of the MHC region (Barcellos et al. 2009) is located in the 3′UTR region of the ATF6B gene. Although our best proxy for rs8283, SNP rs429150, r2=0.94, was not associated with SLE in the present study, both rs204890 and rs8283 are located in the same LD block in the HapMap CEU population suggesting that a single causal variant may be responsible for the rs204890 association in the present study and the rs8283 association in European ancestry subjects.
The rs2844580 SNP is located ~9 kb from the HLA-B gene and ~38 kb from the MHC class I chain-related A (MICA) gene. Genetic variation in both genes has been found to be associated with SLE in European ancestry populations (Gambelunghe et al. 2005; Price et al. 1999; Sanchez et al. 2006; Smerdel-Ramoya et al. 2005), the most consistent being the association with the 8.1 ancestral haplotype (AH) (Candore et al. 2002; Price et al. 1999). The 8.1 AH consists of the HLA-A*01, HLA-Cw7, HLA-B*08, MICA-5.1, DRB1*0301, and DQA1*05-DQB1*02 alleles (Ide et al. 2005) and has been associated with different immune diseases in European populations (reviewed in Candore et al. 2002). Because the 8.1 AH is specific to European populations, our findings suggest that other genetic variants in the HLA-B and MICA neighborhood may contribute to SLE risk.
Our combined analysis suggests that the four newly identified SNPs (rs9271366, rs204890, rs2071349, and rs2844580) are independent signals of SLE risk in African American women. This conclusion is supported by the low linkage disequilibrium values among the four newly identified SNPs. In an analysis summing the risk of the four newly identified SNPs into a genetic risk score, the risk of SLE increased in a linear way according to the number of risk alleles. It is noteworthy that the estimated OR was robust to the use of different sets of SLE cases, suggesting that our results are not being distorted by bias related to survival from SLE or to specificity of case definition.
Comparison with previously published results suggests that some genetic risk factors for SLE may be shared among African Americans and other ethnic groups such as East Asian and European populations. Our most significant result, the rs9271366, was also found to be associated with SLE risk in Chinese subjects (Han et al. 2009); the second SNP from the Chinese GWAS (rs3997854) was also associated with SLE risk in the BWHS although in an opposite direction: the C-allele was protective in Chinese subjects but harmful in the BWHS. Differences in allele frequencies and haplotype background may explain the opposite results for the rs3997854 SNP. The C-allele of the rs3997854 SNP has a frequency of 26% in the BWHS as compared to just 8% in Chinese Han subjects (Han et al. 2009), suggesting that different C-allele carrying haplotypes may be present in the BWHS and they may have different effects on risk of SLE. We also found that the two reported SNPs from the SLEGEN GWAS (Harley et al. 2008), rs3131379 and rs1270942 (tested with the perfect proxy rs440454), were associated with increased risk of SLE in the present study. However, these two SNPs did not show independent effects in the BWHS after conditioning for the rs9271366 SNP. Although none of the SNPs from the high-density screening of the MHC region (Barcellos et al. 2009) was replicated in the present study, some of the genetic regions were shared with BWHS subjects. For example, the rs204890 SNP reported in the present study and the rs8283 from the high-density screening are both located in the ATF6B gene. Thus, genetic variation in the ATF6B gene seems to affect risk of SLE in African American and European ancestry subjects.
To our knowledge this is the first screening of the MHC region in relationship to SLE in a population-based cohort of African American subjects. Our case-control study was derived from a well-characterized cohort, adding to the strength of the reported results. Control for degree of European ancestry and the case-control matching by region of residence make it unlikely that our results are due to population stratification. The classification of cases of SLE was based on the judgment of the women’s physicians as to the presence of ACR criteria for the illness. While these decisions may have varied across physicians, the fact that our most significant result (the rs9271366 SNP) was also reported in a previous GWAS in Chinese subjects and that we reproduced findings in significant SNPs from the SLEGEN study supports the validity of the BWHS case definition and results. Our study had 80% statistical power to identify ORs of 1.4 or greater for SNPs with minor allele frequencies >10% at MHC-wide significance level. Therefore, it is likely that rare variants or SNPs with smaller effects on the risk of SLE in the MHC region remain to be found.
In summary, our results suggest the presence of at least four independent signals in the MHC region associated with SLE in African American women, and they may be shared in part with other ethnic groups. Taken together, our results and previous GWAS in European and East Asian ancestry populations show that women of different ancestral origins may share some genetic components for the risk of SLE. The identity of the causal variants that are being tagged by the reported SNPs is still unknown. Further studies are needed to narrow the position of the potential causal variants.
This work was supported by grant U01 AI067146 from the National Institute of Allergy and Infectious Diseases, and by grants R01CA058420 and R01CA098663 from the National Cancer Institute. The genotyping for the present project was subsidized by a grant to the Broad Institute Center for Genotyping and Analysis (grant U54 RR020278) from the National Center for Research Resources. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Allergy and Infectious Diseases, the National Cancer Institute, or the National Institutes of Health. We thank Dr. Timothy McAlindon and Dr. Elizaveta Vaysbrot for their medical record reviews.