OBJECTIVE—Two recent genome-wide association (GWA) studies have revealed novel loci for type 1 diabetes, a common multifactorial disease with a strong genetic component. To fully utilize the GWA data that we had obtained by genotyping 563 type 1 diabetes probands and 1,146 control subjects, as well as 483 case subject–parent trios, using the Illumina HumanHap550 BeadChip, we designed a full stage 2 study to capture other possible association signals.
RESEARCH DESIGN AND METHODS—From our existing datasets, we selected 982 markers with P < 0.05 in both GWA cohorts. Genotyping these in an independent set of 636 nuclear families with 974 affected offspring revealed 75 markers that also had P < 0.05 in this third cohort. Among these, six single nucleotide polymorphisms in five novel loci also had P < 0.05 in the Wellcome Trust Case-Control Consortium dataset and were further tested in 1,303 type 1 diabetes probands from the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC) plus 1,673 control subjects.
RESULTS—Two markers (rs9976767 and rs3757247) remained significant after adjusting for the number of tests in this last cohort; they reside in UBASH3A (OR 1.16; combined P = 2.33 × 10⁻⁸) and BACH2 (OR 1.13; combined P = 1.25 × 10⁻⁶).
CONCLUSIONS—Evaluation of a large number of statistical GWA candidates in several independent cohorts has revealed additional loci that are associated with type 1 diabetes. The two genes at these respective loci, UBASH3A and BACH2, are both biologically relevant to autoimmunity.
Type 1 diabetes is a multifactorial disease with a strong genetic component that results from autoimmune destruction of the pancreatic β-cells. The major type 1 diabetes susceptibility locus, mapping to the HLA class II genes at 6p21 (1) and encoding highly polymorphic antigen-presenting proteins, accounts for almost 50% of the genetic risk for type 1 diabetes (2). Several other loci with more modest effects are known, but they do not account for the remaining portion of the risk.
The recent development of high-throughput single nucleotide polymorphism (SNP) genotyping array technologies has enabled us (3) and others (4) to perform high-density genome-wide association (GWA) studies in search of the remaining type 1 diabetes loci. We recently reported the outcome of our GWA study for type 1 diabetes in a large pediatric type 1 diabetic cohort of European descent (3); in addition to confirming previously identified loci, we observed highly significant and replicated association with KIAA0350 (now renamed CLEC16A [C-type lectin domain family 16 member A]). Subsequent follow-up of our data also revealed a locus on 12q13 (5). In parallel and independently, the Wellcome Trust Case Control Consortium (WTCCC) (4) also demonstrated replicated (6) association to the same linkage disequilibrium blocks at 16p13 and 12q13, along with two additional loci on 12q24 and 18p11.
The loci that we have reported thus far achieved statistical significance either in the GWA genotyping itself (stage 1) or after replication in additional cohorts (stage 2) of only a small number of the most promising loci. Here, we describe the results of a full evaluation of all statistical candidates from the GWA phase.
The Canadian cohort consisted of 1,120 nuclear family trios (one affected child and two parents) and 267 independent type 1 diabetes cases, collected in pediatric diabetes clinics in Montreal, Toronto, Ottawa, and Winnipeg. The median age at onset is 8 years with lower and upper quartiles at 4.6 and 11 years, respectively. All patients were diagnosed under the age of 18 years and treated with insulin since diagnosis, and none have stopped treatment for any reason since. Disease diagnosis was based on these clinical criteria rather than any laboratory tests. Ethnic backgrounds were of mixed European descent, with the largest single subset (409 families) being French Canadian. The Research Ethics Board of the Montreal Children's Hospital and other participating centers approved the study, and written informed consent was obtained from all subjects.
The Type 1 Diabetes Genetics Consortium cohort consisted of 549 families (2,350 individuals) with at least two children diagnosed with diabetes and both parents available as of the July 2005 data freeze. Criteria were age at diagnosis below 35 years and uninterrupted treatment with insulin within 6 months of diagnosis. For siblings of probands diagnosed under the age of 35 years, the age-at-diagnosis limit was extended to 45 years if they were lean and had positive islet cell antibodies and/or low C-peptide levels at diagnosis. The median age is 8 years with quartiles at 4 and 13 years. The samples were collected in Europe, North America, and Australia.
The type 1 diabetes cohort consisted of 103 children recruited at the Children's Hospital of Philadelphia (CHOP) since September 2006, as previously described (3).
The Diabetes Control and Complications Trial (DCCT) was a multicenter randomized clinical trial to determine the effect of intensive insulin treatment with respect to reduced development and progression of retinopathy and nephropathy complications in patients with type 1 diabetes (7,8). A total of 1,441 subjects with type 1 diabetes were recruited from 29 centers across North America into the DCCT between 1983 and 1989; they were between 13 and 39 years of age, and 53% were male. They were recruited into two cohorts. The primary prevention cohort consisted of 726 subjects with no retinopathy, an albumin excretion rate <28 μg/min, and diabetes duration of 1–5 years and was studied to determine whether intensive therapy prevented the development of diabetic retinopathy in patients with no retinopathy. The secondary intervention cohort consisted of 715 subjects who had nonproliferative retinopathy, a urinary albumin excretion rate <140 μg/min, and diabetes duration of 1–15 years and was studied to determine whether intensive therapy would affect the progression of early retinopathy (7). Approval for the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC) genetics study was provided by the Research Ethics Board of the Hospital for Sick Children, Toronto.
All available probands were genotyped with the Illumina 1M assay. To detect and remove outliers due to population stratification from the majority of self-reported white probands, Eigenstrat (9) was used to select probands by sequential analysis. After exclusion of outliers, there were 1,303 DCCT/EDIC probands (695 male and 608 female), with a mean ± SD age at type 1 diabetes diagnosis of 21 ± 8 years (range 0–38).
The control group used to match with the DCCT/EDIC cases was drawn from 2,024 children with self-reported Caucasian ethnicity and mean age 8.82 years (50.83% male and 49.17% female) who did not have diabetes or a first-degree relative with type 1 diabetes. These individuals were recruited by CHOP's clinicians and nursing staff within CHOP's health care network, including four primary care clinics and several group practices and outpatient practices that included routine check-up visits of healthy children. Of these 2,024 individuals, 1,673 were selected using population-stratification analysis from Eigenstrat similar to that described above for DCCT/EDIC probands (868 male, 801 female, and 4 with ambiguous gender). The remaining 351 (17.3%) self-reported European individuals were removed from the control group to address population heterogeneity. The Research Ethics Board of CHOP approved the study, and written informed consent was obtained from all subjects.
Genotypes for this study were obtained using the Infinium and GoldenGate platforms from Illumina. We performed high-throughput genome-wide SNP genotyping using the Illumina Infinium II HumanHap550 BeadChip technology (Illumina, San Diego) (10,11) at the Center for Applied Genomics at CHOP. We used 750 ng genomic DNA to genotype each sample according to the manufacturer's guidelines. DCCT/EDIC samples were genotyped on the Illumina 1M chip at Illumina (San Diego, CA).
All statistical tests for association were carried out using the software package PLINK (12). The single-marker analysis for the genome-wide data was carried out using a χ² test on allele-count differences between 563 case and 1,146 control subjects. Odds ratios (ORs) and corresponding 95% CIs were calculated for the association analysis. The transmission disequilibrium test was used to calculate P values on differences between transmitted and untransmitted allele counts in the type 1 diabetic trios and nuclear families. Counts of untransmitted and transmitted alleles from heterozygous parents to affected offspring were determined using the standard transmission disequilibrium test implemented in the Haploview software package (13). The P values from the case-control and family-based analyses in our three discovery cohorts were combined by weighted z scores to quantify the overall evidence for association.
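The statistics named above can be sketched in a few lines of Python. This is an illustration only, not the PLINK or Haploview implementation: the allele counts in the usage lines are made up, the CI uses the Woolf (log-OR) approximation, and the square-root-of-sample-size weights are an assumption, since the text does not state which weights were used.

```python
import math
from statistics import NormalDist

_norm = NormalDist()

def chi2_1df_pvalue(stat):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

def allelic_test(case_a, case_b, ctrl_a, ctrl_b):
    """Chi-square test on a 2x2 table of allele counts, with OR and 95% CI.

    All four counts must be non-zero (no continuity correction is applied).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    stat = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        (case_a + case_b) * (ctrl_a + ctrl_b) * (case_a + ctrl_a) * (case_b + ctrl_b)
    )
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    # Woolf approximation: standard error of log(OR) from the cell counts.
    se_log_or = math.sqrt(1 / case_a + 1 / case_b + 1 / ctrl_a + 1 / ctrl_b)
    ci = (
        math.exp(math.log(odds_ratio) - 1.96 * se_log_or),
        math.exp(math.log(odds_ratio) + 1.96 * se_log_or),
    )
    return stat, chi2_1df_pvalue(stat), odds_ratio, ci

def tdt(transmitted, untransmitted):
    """TDT: McNemar-type statistic on transmissions from heterozygous parents."""
    stat = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    return stat, chi2_1df_pvalue(stat)

def combine_weighted_z(p_values, directions, sample_sizes):
    """Combine two-sided P values as signed z scores weighted by sqrt(N).

    directions holds +1 or -1 per cohort (effect direction of a fixed allele).
    The sqrt(N) weighting is an assumption made for this sketch.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    z_scores = [d * _norm.inv_cdf(1 - p / 2) for p, d in zip(p_values, directions)]
    z = sum(w * zi for w, zi in zip(weights, z_scores)) / math.sqrt(
        sum(w * w for w in weights)
    )
    return z, math.erfc(abs(z) / math.sqrt(2))

# Made-up allele counts (2 alleles x 563 cases, 2 alleles x 1,146 controls).
stat, p, odds_ratio, ci = allelic_test(case_a=300, case_b=826, ctrl_a=520, ctrl_b=1772)
```

Signed z scores are used so that cohorts pointing in opposite directions cancel rather than reinforce each other.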
The flow process of this study is shown in Table 1. Comparisons of the statistical power of each population cohort are shown in supplementary Fig. 1 (available in an online appendix at http://dx.doi.org/10.2337/db08-1022). Using our GWA data from 563 Caucasian type 1 diabetes probands and 1,146 control subjects plus 483 type 1 diabetes case-parent trios using the Illumina HumanHap550 BeadChip (3), we identified 982 SNPs outside the major histocompatibility complex region that were suggestive of a potential type 1 diabetes association in the same direction in both cohorts (P < 0.05). We then genotyped these SNPs using the Illumina GoldenGate platform in an independent cohort of 636 nuclear type 1 diabetic families from Canada and the Type 1 Diabetes Genetics Consortium. With the completion of genotyping the third cohort, the WTCCC summary data became available (http://www.wtccc.org.uk) (4). Consequently, we selected markers that met the P < 0.05 threshold both in this third cohort and in the WTCCC dataset (4). Imputation from the Affymetrix data of the WTCCC set was near perfect in all cases (supplementary Table 1). As shown in Table 2, 33 markers met the P < 0.05 threshold across all four cohorts. Although the bulk of them mapped to known loci (PTPN22, 12q13, KIAA0350 [3,6], IL2RA [15–17], CTLA4, and IFIH1), six SNPs in five loci were completely novel. These were tested in an additional case-control cohort consisting of 1,303 type 1 diabetes probands from the DCCT/EDIC study and an independent dataset of 1,673 control subjects from Philadelphia who had been genotyped on the Illumina 1M and HumanHap550K BeadChips, respectively.
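The staged selection described above amounts to a sequence of filters on per-cohort results. The following Python sketch shows the logic under stated assumptions: the marker names, cohort keys, and P values are hypothetical, and requiring a consistent effect direction at every stage is an assumption (the text states it explicitly only for the two GWA cohorts).

```python
from dataclasses import dataclass

@dataclass
class CohortResult:
    p_value: float
    effect_sign: int  # +1 or -1: direction of effect for a fixed reference allele

def passes_stage(results, cohorts, alpha=0.05):
    """True if the marker has P < alpha in every listed cohort, all in one direction."""
    selected = [results[c] for c in cohorts]
    nominal = all(r.p_value < alpha for r in selected)
    same_direction = len({r.effect_sign for r in selected}) == 1
    return nominal and same_direction

# Hypothetical markers and values, for illustration only.
markers = {
    "snp_A": {
        "gwa_case_control": CohortResult(0.010, +1),
        "gwa_trios": CohortResult(0.030, +1),
        "families": CohortResult(0.020, +1),
    },
    "snp_B": {  # nominal in both GWA cohorts but in opposite directions
        "gwa_case_control": CohortResult(0.010, +1),
        "gwa_trios": CohortResult(0.020, -1),
        "families": CohortResult(0.001, +1),
    },
    "snp_C": {  # fails the third cohort
        "gwa_case_control": CohortResult(0.040, -1),
        "gwa_trios": CohortResult(0.004, -1),
        "families": CohortResult(0.300, -1),
    },
}

gwa_cohorts = ["gwa_case_control", "gwa_trios"]
stage1 = [m for m, r in markers.items() if passes_stage(r, gwa_cohorts)]
stage2 = [m for m in stage1 if passes_stage(markers[m], gwa_cohorts + ["families"])]
```

Each later stage only re-examines markers that survived the earlier one, which is why the candidate list shrinks from 982 to 75 to 33 in the study.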
Two signals replicated in this fifth independent cohort (Table 3), and the P values were significant after correction for testing six markers (five independent loci). They map to UBASH3A (ubiquitin-associated and SH3 domain-containing protein A) and BACH2 (broad complex-tramtrack-bric-a-brac [BTB] and cap ‘n’ collar [CNC] homology 2). Table 4 shows that rs9976767 is in fact significant at the genome-wide level when all five cohorts are combined (P = 2.33 × 10⁻⁸).
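The two thresholds involved here can be made concrete with a short Python sketch. It assumes a Bonferroni correction for the six tested markers and the conventional genome-wide threshold of 5 × 10⁻⁸; the text names neither explicitly, so both are assumptions. The combined P values are the ones quoted in this article.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def survives_correction(p_value, alpha=0.05, n_tests=6):
    """True if a P value stays significant after correcting for n_tests tests."""
    return p_value < bonferroni_threshold(alpha, n_tests)

# Assumed conventional genome-wide threshold; not stated in the text.
GENOME_WIDE_ALPHA = 5e-8

# Combined P values quoted in the article for the two replicated markers.
combined_p = {"rs9976767": 2.33e-8, "rs3757247": 1.25e-6}
genome_wide = {snp: p < GENOME_WIDE_ALPHA for snp, p in combined_p.items()}
```

With six tests the per-marker threshold in the final cohort is 0.05/6, or roughly 0.0083; correcting for five independent loci instead would give 0.01.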
Taken together, our full second-stage approach and combined meta-analysis have revealed additional loci associated with type 1 diabetes. Clearly the risks are relatively modest compared with previously described associations, and it was only with this sample size at our disposal that we could detect and establish these signals as true positives through an independent validation effort.
UBASH3A is the only gene in its corresponding region of linkage disequilibrium. Mice lacking Sts2 (the mouse homologue for UBASH3A) have been shown to be normal in all respects, including T-cell function (20). Mice lacking both Sts1 and Sts2 do have increased splenocyte numbers and are hyper-responsive to T-cell receptor stimulation. It has been suggested that STS1 and STS2 are critical regulators of the signaling pathways that control T-cell activation (20).
BACH2 is also the only gene at its corresponding region of linkage disequilibrium. The gene product is a member of the small Maf family, which consists of basic region leucine zipper proteins that function either as transcriptional activators or repressors depending on the proteins with which they heterodimerize. Muto et al. (21) found that Bach2−/− mice had relatively high levels of serum IgM but low levels of IgA and IgG subclasses. The Bach2−/− mice have also been reported to present with deficient T-cell–independent and T-cell–dependent IgG responses, leading the authors to conclude that BACH2 was a regulator of the antibody response.
It should also be noted that rs1983853 yielded a nominally significant association with type 1 diabetes in all of the cohorts but did not survive correction for multiple testing in the final validation attempt in the Toronto dataset. This SNP resides in endothelial differentiation gene 7 (EDG7; formerly LPA3), which has been implicated in mechanisms of embryo implantation (22). The SNPs on GLIS3 and RASGRP1 were not validated. They may have been false positives in the earlier stages; alternatively, lack of replication in DCCT/EDIC may be due to different and/or weaker genetic risk determinants in this cohort with late age of onset of type 1 diabetes. This question must be addressed in future studies. The GLI-similar 3 (GLIS3) gene plays important roles in the development of pancreatic β-cells. Mutations in this gene cause a rare syndrome with neonatal diabetes and congenital hypothyroidism (23). The RAS guanyl releasing protein 1 (RASGRP1) gene has important roles in immune regulation, and it has been suggested that it contributes to the autoimmunity of systemic lupus erythematosus (24).
In addition to the findings described above, our study confirmed another interesting locus, rs17696736 (C12orf30) at 12q24, reported in the WTCCC study (4,6). Our GWA family cohort suggested type 1 diabetes association with P = 0.011; however, limited by the sample size, our GWA case-control cohort did not show statistical significance (P > 0.05). To validate the type 1 diabetes association, we genotyped rs17696736 using the Sequenom iPLEX assay (Sequenom, Cambridge, MA) in the 1,120 Canadian families and the 549 Type 1 Diabetes Genetics Consortium families. The call rate of rs17696736 genotyping was 99.8%, and no Mendelian errors were found. With the family-based association test (25), we confirmed the type 1 diabetes association with P = 8.00 × 10⁻⁷, minor G allele frequency 0.452, and OR 1.276.
What we failed to find also deserves comment. Given the very thorough coverage of European genetic variation by the Hap550 and the power of our aggregate sample size, it is very unlikely that we missed more than a very small number of common variants with an effect size approaching that of the INS (minor allele frequency 0.2 and OR 0.5; each of our three discovery cohorts has >99.9% power to detect it at α = 0.05 level) or PTPN22 (minor allele frequency 0.1 and OR 1.8; each of our three discovery cohorts has >99.0% power to detect it at α = 0.05 level) loci.
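The order of magnitude of these power figures can be checked with a normal-approximation calculation. The Python sketch below is not the authors' power analysis: it covers only the case-control discovery cohort (563 case and 1,146 control subjects), treats the quoted minor allele frequency as the control frequency, assumes a multiplicative allelic OR, and uses a two-sided unpooled two-proportion z test.

```python
import math
from statistics import NormalDist

def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by the control frequency and allelic OR."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)

def allelic_power(control_freq, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    norm = NormalDist()
    p_case = case_allele_freq(control_freq, odds_ratio)
    # Each subject contributes two alleles.
    se = math.sqrt(
        p_case * (1 - p_case) / (2 * n_cases)
        + control_freq * (1 - control_freq) / (2 * n_controls)
    )
    z_effect = abs(p_case - control_freq) / se
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)

# Effect sizes quoted in the text for INS and PTPN22.
power_ins = allelic_power(control_freq=0.2, odds_ratio=0.5, n_cases=563, n_controls=1146)
power_ptpn22 = allelic_power(control_freq=0.1, odds_ratio=1.8, n_cases=563, n_controls=1146)
```

Under these assumptions both values come out above 0.99, in line with the figures quoted for the discovery cohorts; the family-based cohorts would need a separate TDT power calculation.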
Undoubtedly, larger sample sizes and meta-analysis of all available GWA data will discover an increasing number of loci with decreasing effect sizes, which are unlikely to explain the remaining familial clustering of type 1 diabetes. Such explanation should be sought, it appears, in rare variants, the detection of which is now coming within reach with the use of high-throughput methods for sequencing and for detecting structural variation.
We gratefully acknowledge the use of DNA samples from the Type 1 Diabetes Genetics Consortium, funded by National Institutes of Health Grant U01-DK62418. This work was funded in part by the Juvenile Diabetes Research Foundation (JDRF) International and Genome Canada through the Ontario Genomics Institute. H.Q.Q. is supported by a fellowship from the Canadian Institutes of Health Research. All genotyping and other aspects of the study were funded by an institutional development grant to the Center for Applied Genomics from CHOP. S.F.A.G. and H.H. are funded in part by a JDRF award and a development award from the Cotswold Foundation. This work has received funding from the National Institute of Diabetes and Digestive and Kidney Diseases (N01-DK-6-2204 and R01-DK-077510). A.D.P. holds a Canada Research Chair in the Genetics of Common Diseases.
No potential conflicts of interest relevant to this article were reported.
We thank all the patients, their parents, and the healthy control subjects for their participation in the study.
Published ahead of print at http://diabetes.diabetesjournals.org on 7 October 2008.
S.F.A.G. and H.-Q.Q. contributed equally to this study.
The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Diabetes and Digestive and Kidney Diseases or the National Institutes of Health.