|Home | About | Journals | Submit | Contact Us | Français|
HLA class II DRB1 and DQB1 represent the major type 1 diabetes (T1D) genetic susceptibility loci; however, other genes in the HLA region are also involved in T1D risk. We analyzed 1411 pedigrees (2865 affected individuals) from the type 1 diabetes genetics consortium (T1DGC) genotyped for HLA classical loci and for 12 SNPs in the class III region previously shown to be associated with T1D in a subset of 886 pedigrees. Using the transmission disequilibrium test, we compared the proportion of SNP alleles transmitted from within the high risk DR3 and DR4 haplotypes to affected offspring. Markers rs4151659 (mapping to CFB) and rs7762619 (mapping 5′ of LTA) were the most strongly associated with T1D on DR3 (p=1.2 × 10−9 and p=2 × 10−12 respectively) and DR4 (p=4 × 10−15 and p= 8 × 10−8 respectively) haplotypes. They remained significantly associated after stratifying individuals in analyses for B*1801, A*0101-B*0801, DPB1*0301, DPB1*0202, DPB1*0401 or DPB1*0402. Rs7762619 and rs4151659 are in strong linkage disequilibrium (LD) (r2=0.82) with each other, but a joint analysis showed that the association for each SNP was not solely due to LD. Our data support a role for more than one locus in the class III region contributing to risk of T1D.
Type 1 diabetes (T1D) is a form of diabetes mellitus and an autoimmune disease resulting from the immunological destruction of the insulin producing beta cells of the pancreas, which leads to dysregulation of glucose metabolism. T1D has a strong genetic component and approximately 40–50% of the familial clustering of T1D can be attributed to allelic variation in the HLA region (1). The direct involvement of the HLA class II DRB1, DQA1 and DQB1 genes in T1D is well established, with a hierarchy ranging from very predisposing to very protective effects (see 2–3 and references therein). The most highly predisposing DRB1-DQB1 haplotypes are DRB1*0301 DQA1*0501 DQB1*0201 (DR3 for short) and DRB1*0401/2/4/5 DQA1*0301 DQB1*0302 (DR4 for short) with odds ratios > 3.0 for DR3 and > 8.0 for the most predisposing DRB1*04 subtypes (0401 and 0405) (see 3). However, allelic variation at these loci cannot account fully for the pattern of HLA haplotype sharing among affected sib pairs (4). Results from association analyses of the classical HLA loci (class I and DPB1) have shown that other loci are involved in susceptibility to T1D (4–8), with HLA-B demonstrating the strongest association in class I (5–7). Several authors have also reported consistent and reproducible associations of DPB1 with T1D (e.g. 8). Moreover, other polymorphisms within the HLA region yet outside the classical loci have been found to contribute to T1D susceptibility, such as the gene region telomeric to HLA-A associated with T1D after conditioning for DR-DQ, HLA-A, -B and –C (9).
The human MHC class III region is of particular interest. It has been implicated in risk for T1D and the associations observed are not due to linkage disequilibrium (LD) with the DR-DQ genes (10–13). A recent study identified polymorphisms in the AIF-1 gene, mapping to class III, to be significantly associated with T1D risk after conditioning on DRB1, DQB1, HLA-A and HLA-B variation (12). The human MHC class III region comprises a 700 kb sequence located between the centromeric class II (HLA-DRA) and the telomeric class I regions (MICB), and encodes over 60 genes resulting in one of the most gene-dense regions of the human genome (13). These genes include those involved in the activation cascades of the complement system (C2, factor B [BF], C4), hormonal synthesis (steroid 21-hydroxylase, CYP21), inflammation and cell stress (tumour necrosis factor [TNFα], lymphotoxins α and β, 70 kD heat shock proteins (HSPA1A, HSPA1B, HSPA1L), extracellular matrix organization (Tenascin [TNX] as well as immunoglobulin superfamily (Ig-SF) members (1C7, G6f and G6b). The remainder (the majority) of loci are primarily involved in more ‘core’ biological functions with no immediate implication for the immune system. (14–15).
In a previous study of 886 T1D families (17), we applied an overall conditional genotype method to detect associations between variants within the MHC class III region and disease risk after adjusting for the DRB1-DQB1 effect. The null hypothesis of no association assumed that the relative genotype frequencies at marker loci not involved in disease risk are the same in cases and controls on specific DRB1-DQB1 genotypes. This conditional method was applied to the probands from each of the 886 families and the affected family based controls. We tested 356 SNPs within the MHC class III and identified 13 polymorphisms that showed a significant association with T1D risk after adjusting for multiple testing, and conditioning for the DRB1-DQB1 genotype, in both discovery (n=289 families) and replication datasets (n=597 families). Those families represented the set of type 1 diabetes genetics consortium (T1DGC) Caucasian pedigrees with HLA molecular typing available at that time (17).
We hypothesized that if these SNPs were truly implicated in T1D genetic risk, they would affect the risk of the highly predisposing DRB1-DQB1 haplotypes DR3 and DR4 even after matching for DPB1 or for HLA-A and HLA-B class I alleles. The aim of the present study was to assess whether associations we have previously reported in the class III region (17) were reproducible using a different methodological approach and matching for additional classical HLA loci. To test this hypothesis, we analyzed all the complete families from the T1DGC with classical HLA typing. ‘This was possible as high resolution molecular HLA typing became available during 2008 for many more T1DGC samples. The large sample size of this collection (1411 pedigrees, 2865 affected individuals) makes stratification analysis for extended class I-class II haplotypes possible (changed order of words), allowing, in turn, the investigation of additional genetic loci within the MHC class III region that have been suggested by previous studies (10–13).
Using the informative transmissions for the SNPs previously identified by us (15), we assessed whether these polymorphisms significantly affected the risk of predisposing DR3 and DR4 haplotypes to T1D (Figure 1) using both all samples and the subset of samples not previously tested. The strongest association on DR4 haplotypes was observed for rs4151659 (p =4 × 10−15) whereas the strongest association on DR3 haplotypes was observed for rs7762619 (p =4 × 10−12). We decided to concentrate on the role of these two SNPs. First, we explored whether the association was due only to one or a few cohorts, or if there was heterogeneity between cohorts. No evidence for heterogeneity was detected between cohorts with respect to the association of these SNPs on DR3 and DR4 haplotypes (Figure 2). The p-values from meta-analyses are similar to those obtained by pooling all cohorts: p =4 × 10−15 from pooling vs p<5 × 10−18 from meta-analyses and p =4 × 10−12 from pooling vs p= 7 × 10−11 for DR4-rs4151659 and DR3-rs7762619, respectively. Hence, we concluded that pooling data from all cohorts was a valid way of analyzing the association of these SNPs on DR3 and DR4 haplotypes, adjusting for additional classical HLA loci. Alleles within the class II DPB1 gene and the class I HLA-A and HLA-B loci, however, have also been shown to influence risk to T1D (e.g. 4–8), in addition to DR-DQ. We therefore assessed whether the association for rs4151659 or rs7762619 remained present after matching for DPB1, HLA- A or HLA-B alleles on DR3 and DR4 haplotypes. For HLA-A and -B class I genes, we stratified by the following categories when there were at least 20 informative transmissions: A*2401, B*1801, and B*4001 and the highly conserved A*0101-B*0801 on DR3 haplotypes, and A*0201 B*1501 on DR4 haplotypes. For HLA-DPB1, we stratified by the well-known additional effects of: DPB1*0101, *0301, *0201, *0202, *0401, *0402 when there was sufficient sample size. Results show that the presence of minor alleles for both rs4151659 and rs7762619 significantly affect the risk of extended haplotypes including those carrying alleles previously shown to be associated with T1D (Figure 3) (4–8). The increased risk due to the presence of the minor allele of rs4151659 and rs7762619 was not statistically significant on every extended haplotype tested due to the limited number of informative transmissions for some of them, but in every instance the trend was the same. We also included the extended DPB1*0101 DR3 A*0101-B*0801 haplotype. Although the number of transmissions is extremely small (n=10), the association between T1D and both markers was statistically significant (Figure 3). We conclude that the effect of rs4151659 and rs7762619 is not due to these alleles marking extended haplotypes which contain specific alleles at other classical HLA loci. For haplotypes where a large number of informative transmissions were available, e.g. DR4 DPB1*0401, the observed p-values are in the 10−9 range. For less common haplotypes, the p-values are higher, but the same effect is seen consistently, i.e. the minor allele at these two markers is associated with higher risk of T1D among DR3 and DR4 haplotypes regardless of the DPB1 or class I alleles carried.
Rs4151659 and rs7762619 map approximately 400 kilobases apart from each other (Table 2). In spite of this, both SNPs are in high LD, with r2=0.8071 on DR3 haplotypes and r2=0.8397 on DR4 haplotypes. Thus, both SNPs could be marking the same association in spite of the large physical distance between them. To test if this was the case, we carried out a stratified analysis equivalent to that performed above on HLA-B, HLA-A and DPB1. We found that among DR3 haplotypes with the major, low risk T allele at rs7762619, carrying the minor G allele at rs4151659 resulted in significantly higher risk (p<0.004; Table 3). A similar trend that did not reach statistical significance was seen on DR4 (p<0.06). For both DR3 and DR4 haplotypes that carry the major low risk A allele at rs4151659, the presence of the minor allele G at rs7762619 resulted in significantly higher risk of DR4 (p<0.006) and DR3 haplotypes (p<0.02) (Table 3). These data indicate that there are most likely two distinct, albeit correlated, genetic effects on T1D susceptibility marked by rs4151659 and rs7762619.
We assessed whether the association of these two markers was confined only to the highly predisposing DR3 and DR4 haplotypes or if similar effects would be seen on other DRB1-DQB1 haplotypes. We only tested haplotypes for which 30 or more informative transmissions were available. We found strong significant associations overall (Figure 4) and no evidence for heterogeneity for the effect of rs4151659 (I2=0% Q(3 df)= 0.52 n.s.) and rs7762619 (I2=0% Q(3 df)= 0.52 n.s.) among the four haplotypes studied. These data indicate that the predisposing effect of the minor alleles at both rs4151659 and rs7762619 are not limited to DR3 and DR4 haplotypes but are seen more generally, including on neutral/protective DRB1-DQB1 haplotypes. In addition, the deviation from the null observed (Figure 4) in the presence of these haplotypes is comparable to that seen for DR3 and DR4 (Figure 2). Furthermore, in spite of the low MAF of these SNPs, there was a sufficient number of informative transmissions of SNP alleles on other haplotypes to obtain highly statistically significant results with p<1 × 10−7 (Figure 4).
All of the above analyses focused on the difference in transmitted versus not transmitted proportions stratified by classical loci haplotypes. We also investigated the association effect size of rs4151659 and rs7762619 on T1D by conditioning on HLA classical loci haplotypes using the overall conditional haplotype method. The results (Table 4) are similar to those obtained from the previous analyses, although the p-values are in the 10−5–10−8 range. This method also enabled us to compute an estimate of the genetic association effect size which ranged from 2.28 to 3.44.
Furthermore, using the methods by Wacholder and co-workers (18), and Ioannidis’ (19) Bayesian approach and assuming a conservative prior probability of 1 × 10−4, (average alternative effect OR=2.0) we found that the association of rs7762619 had a false positive report probability (FPRP) of 0.001 and a posterior credibility ranging from 0.974 to 986. For the association of rs4151659, the FPRP ranges from 0.001 to 0.178 and posterior credibility from 0.148 to0.94. Thus, overall, the observed associations appear unlikely to be false positives. However, the credibility of the rs4151659 association when conditioning on DRB1-DQB1-HLA-B is much lower than when conditioning for the presence of other loci.
Finally, we explored whether these two markers are in LD with a SNP mapping to the AIF-1 gene in the class III previously reported to be strongly associated with T1D, rs2259571. We found that although both rs4151659 and rs7762619 were in statistically significant LD with rs2259571, they were not highly correlated with it (r2 =0.028 for both), so they are unlikely to be tagging the AIF-1 associated allele.
In this study, we show that two SNP variants within the HLA class III region modify risk for T1D conferred by the high risk DR3 and DR4 haplotypes and appear to also influence the risk conferred by other DRB1-DQB1 haplotypes. Although these two SNPs are highly correlated, we determined that association at these markers may not be exclusively due to LD with each other, suggesting that more than one T1D class III gene is present. Moreover, our results indicate that genetic variation in the class III region contributes to risk of T1D in addition to classical class I and class II variation. These results also demonstrate that utilization of a large, well characterized T1D dataset, such as the one assembled by the T1DGC, makes it possible to dissect various and complex HLA risk components of T1D susceptibility.
One of the markers identified by our study maps 5′ of the lymphotoxin alpha gene (LTA; GeneID:4049) which encodes a highly inducible, secreted, cytokine that mediates a variety of inflammatory, immunostimulatory, and antiviral responses (16). The other marker is a non-synonymous change (K565E) in the complement factor B gene (CFB GeneID:629). Human complement factor B (BF) is an essential component of the alternate complement pathway, and therefore important in innate immune and autoimmune responses (15–16). An association between CFB genetic variation and T1D has, indeed, been reported recently in the North Indian population (20), but unfortunately the genotyping information available in the current study does not allow us to determine which BF allotypes are in LD with rs7762619.
Both markers map to biologically relevant candidates that may be involved in the pathogenesis of T1D. However, because there are other markers in the region that demonstrate very strong LD with one or both of these associated markers (see Table 2 and reference (17)), we cannot exclude the possibility that true causal variants may not be the ones identified by us. An investigation of functional relevance of the genetic associations reported here at both the LTA and CFB and neighboring genes should improve our understanding of the pathogenesis of T1D.
We note several study limitations. The current study focused on DR3 and DR4 haplotypes and only on a limited number of markers which had shown an association, overall, in a previous study. There may be other markers within class III associated with T1D risk. In addition, there may be markers that do not modify the risk of DR3 or DR4, but do modify the risk of e.g. protective haplotypes such as those carrying DQB1*0602 (see e.g. (21)). Furthermore, our study has examined only cases of European descent, and these associations may not generalize to other ethnic groups.
Our findings suggest the involvement of MHC regions in addition to those reported in other studies. An independent study (12), which also used the T1DGC families for replication, identified another class III locus, the AIF-1 gene, as associated with T1D. This could, in part, be due to the different study strategies, i.e., the AIF-1 gene was identified in 434 Norwegian families where the authors tested 53 markers, and subsequently studied the top 3 markers for replication in the T1DGC families. In the current study, we initially examined 356 markers in 889 families, and subsequently studied the top 12 for replication in the whole group of T1DGC families. A large UK study (7) utilizing genotype data from classical HLA loci and SNPs in the MHC region failed to identify any markers in the class III region associated with T1D. This is in contrast with our results and those reported by the Norwegian investigators (12). Differences in marker sets investigated and sample subsets used for discovery may help explain why some variants that were not identified previously, are identified in the current study. There is, in fact, very little overlap among the markers analysed in the current study and those tested by Nejenstev and co-workers (7). Of the class III SNPs tested here, only one - rs1800750 - was included in the large UK study (7). Rs1800750 was found by Nejenstev and co-workers to be nominally associated (p<0.038) with T1D after conditioning for DRB1 and DQB1. These results lend support to our current findings. However, the unconditional association of rs1800750 in the UK study with a considerably smaller sample size compared to the current study has a significance of p<1.09 × 10−7. Using an unconditional TDT for the same SNP in the current set of 1411 pedigrees demonstrates a p<4.88×10−14. The difference in statistical significance between the two studies for the unconditional association results suggests that the conditional effect size of genetic associations at the markers we identified may be smaller in other samples, even if the association is present.
Genetic association results are not always replicated with the same strength in all data sets. It is also possible that different results could be attributed to the use of different methodological approaches. In the present case we have used two different methods and arrived at very similar conclusions. Nevertheless, although failure to find an effect with one analytic approach does not necessarily mean that the effect it is not there, only replications of associations reported here will increase confidence in our findings.
Both our results and those from other research groups (12) strongly suggest that further research on the role of genetic variation in the MHC class III region on MHC susceptibility should be pursued.
The T1DGC is a large worldwide collaborative study of new T1D families that have been collected in a highly standardized fashion from various populations, to aid in the search for additional T1D genes within and outside the HLA region (22). High resolution HLA typing has been performed at eight classical MHC loci by four genotyping centers using standardized typing protocols, reagents and quality control procedures (3) and SNP genotyping for over 1800 polymorphisms in the MHC has been carried out in most of those samples (see below).
The samples used in this analysis are derived from extant T1D family collections as well as from the T1DGC regional network collections. The first comprises the following collections; Danish, Sardinian, and two US collections: Human Biological Data Interchange (HBDI) repository (Philadelphia, PA) and Joslin Diabetes Collection (Boston, MA), which we have merged into a single extant US cohort. The T1DGC collection comprises, among others, over one thousand Caucasian T1D families that were recruited for the T1DGC through four regional networks, Asia-Pacific (Australia, New Zealand and Asia), Europe, North America and the United Kingdom. All collections, both new and extant, were HLA-typed using the T1DGC protocols. A detailed description of the collection protocol is published elsewhere (23). The number of individuals and pedigrees per cohort, as well as descriptive statistics for the study subjects are shown in Table 1. Only individuals of Caucasian origin were included in the current analysis.
The methods of HLA genotyping and statistical analysis have previously been described (3). Briefly, four digit HLA typing was achieved using oligonucleotide probes that were immobilized to a nylon membrane and hybridized with biotin-labeled locus specific PCR products. Genotyping was performed using the Stripscan and Score proprietary software prior to upload to the coordinating center. Regular quality assurance typing performed on blind samples and between laboratories indicates typing agreement to be greater than 98%. Family based data allowed the assignment of five locus haplotypes (DRB1-DQA1-DQB1-DPA1-DPB1) for statistical analyses. Phased genotypes for HLA classical loci were generated using in-house built macros and double checked manually by two independent individuals. Recombinant cases were excluded if the case could be identified; otherwise the whole family was excluded.
High throughput genotyping was carried out by the Wellcome Trust Sanger Institute on the Illumina Golden Gate platform. A detailed description of sample preparation, genotype calling, and quality control procedures is published elsewhere (24). Briefly, the GENCALL score, which reflects the distance of a sample from the centre of its corresponding cluster, was used to flag SNPs or samples with problems. A call rate cut-off of 80% and more than one error per plate were used to flag a SNP as failed. A 50% quantile GENCALL score for each sample indicated sample quality <0.4 and the entire sample was flagged as failed. Mendelian checks were performed using the PEDCHECK software (25), and the family relationship estimation software PREST (26) was used to aid in the determination of family structure problems. Of the 1536 SNPs genotyped, the current analysis refers exclusively to 12 SNPs that map to the MHC class III region and which we reported to be associated with T1D in a two-stage analysis of a smaller subset of samples (17). The SNPs tested in this study are rs1063632, rs7762619, rs1800750, rs2142234, rs9267532, rs400547, rs707915, rs707936, rs4151659, rs4151664, rs17421133, and rs205000. SNP rs2395106 which had been found associated in our previous study (17) was excluded from the present analyses as the call rate when all T1DGC samples, and not merely a subset, were included was <85% The location, minor allele frequency in HapMap (www.hapmap.org) Caucasian samples and in the current sample set, mapping information, and the list of which SNPs of this set are in LD with each other are shown in Table 2.
Under the null hypothesis that within the HLA region no other loci modify the effect on T1D susceptibility of DR3 and DR4 haplotypes, we expect the frequencies of all SNP alleles on these haplotypes to be the same in transmitted and non-transmitted haplotypes (the conditional haplotype method (27–28)); similarly with extension, to include additional predisposition effects from other classical HLA loci: in this case DR-HLA-B, DR-HLA-A or DR-DPB1 haplotypes. If the SNP under study has no effect on T1D risk, the transmission proportions of both alleles should be the same conditioned on the DRB1-DQB1 haplotype. Let DRi-Aj(T) and DRi-Aj(NT) denote the transmitted and non-transmitted counts of haplotype DRi-Aj respectively where Aj denotes allele j at a SNP under investigation. Under the null hypothesis the T/NT ratio= DRi-A1(T) × DRi-A2(NT)/DRi-A2(T) × DRi-A1(NT) should equal 1. This ratio (T/NTR), although not a measure of association in the usual sense, has the same mathematical properties of an odds ratio (OR) so we used it to report the size of the deviation from the null hypothesis.
Both affected sibs from each family were used. Information on the transmission proportion of each allele of each SNP studied within DR3 and DR4 haplotypes and the number of informative transmissions is included in Table 2. Deviations from the random expectation were assessed using a chi-square test where the total sum in the contingency table was ≥200, or a Fisher’s exact test otherwise.
We used the T/NT ratios to test the assumption of heterogeneity for each specified analysis using the method of DerSimonian and Laird based on work first presented by Cochran (29). In the absence of inter-cohort heterogeneity within samples we constructed a Mantel-Haenszel meta-analysis of data from the samples to assess the overall evidence of association. The Mantel-Haenszel chi-squared test and the Mantel-Haenszel estimate of the OR (25) were used to provide a summary test and T/NT ratio. The StatsDirect v 2.7 software (StatsDirect Ltd, Altricham UK) was used.
Control haplotypes (n=967) were determined based on the affected family-based control (AFBAC) method (30). The transmission of overall haplotypes at all four loci (DPA1, DPB1, DRB1, DQA1 DQB1, HLA-B) was used to determine AFBAC haplotypes. Because only the proband from each family is used in the analysis and the AFBAC method uses at most two haplotypes from each pedigree, this approach does not introduce a bias due to the non-independence between sibs.
The expected allele frequencies were computed, given LD and haplotype relative penetrances. Briefly, the null hypothesis (H0) (is that rs4151659 or rs7762619 allele frequencies will differ between patients and controls due to (a) linkage disequilibrium between the SNPs and DRB1-DQB1 (and DRB1-DQB1-HLA-B, DRB1-DQB1-DPB1, DRB1-DQB1-HLA-A) and (b) due to chance (sampling), thus implying that rs4151659 or rs7762619 are neutral relative to disease predisposition.
Under H0 the expected allele frequencies at rs4151659 or rs7762619 alleles can be computed using
the equation derived by Thomson (31):
where Dij denotes the pairwise linkage disequilibrium coefficient between the ith DRB1-DQB1 (HLA-B/HLA-A/DPB1) haplotype and the jth rs4151659 or rs7762619 allele in the control sample, q denotes the allele or haplotype frequency in patients, p denotes the frequency in the affected family based controls and qexp denotes the expected frequency in patients under the assumption of no involvement of the SNP in disease. This method relies on sampling estimates of pairwise linkage disequilibrium between the SNPs and the classical loci haplotypes, and of the proband and control frequencies derived from the samples under study. Thus, there will be a sampling error associated with the computed value for expected SNP alleles. The larger the control sample the smaller this error would be. This has been taken into account in the statistical tests carried out as previously described (6). Given the large number of classical loci haplotypes and the error that very rare haplotypes could introduce, only haplotypes at the susceptibility loci with an average frequency of 0.5% or higher in combined cases and controls were used. Moreover, only HLA+SNP haplotypes where the expected frequency in controls in the absence of LD would be >0.05% were included in the analysis
Two methods were used to estimate the credibility of the genetic associations identified. Both methods rely on the prior probability of a true association of the tested genetic variant with a disease, on the observed P value, and on statistical power to detect the OR of the alternative hypothesis at the given significance level. The first method used was developed by Wacholder and coworkers (18) to compute the probability of no true association between a genetic variant and disease given a statistically significant finding, and is called the false positive report probability (FPRP). The second method used, developed by Ioannidis (19), calculates the Bayes factor under a spike and smear prior using as an alternative, an average genetic effect to estimate a posterior credibility of the associations found.
This research utilizes resources provided by the Type 1 Diabetes Genetics Consortium, a collaborative clinical study sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institute of Allergy and Infectious Diseases (NIAID), National Human Genome Research Institute (NHGRI), National Institute of Child Health and Human Development (NICHD), and Juvenile Diabetes Research Foundation International (JDRF) and supported by U01 DK062418. This publication was also supported in part by NIH/NIAID contract number HHSN266200400076C, ABD N01-AI-40076 (GT and LFB). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH. AMV is supported by the FP7 large programme grant 200800 TREAT-OA. We thank Evangelis Evangelou for providing us with the excel spreadsheet to compute the credibility of the genetic associations.
Conflict of Interest
The authors declare no conflict of interest.