Over 50 regions of the genome have been associated with type 1 diabetes risk, mainly using large case/control collections. In a recent genome-wide association (GWA) study, 18 novel susceptibility loci were identified and replicated, including replication evidence from 2,319 families. Here, we, the Type 1 Diabetes Genetics Consortium (T1DGC), aimed to exclude the possibility that any of the 18 loci were false-positives due to population stratification by significantly increasing the statistical power of our family study.
We genotyped the most disease-predicting single-nucleotide polymorphisms at the 18 susceptibility loci in 3,108 families and used existing genotype data for 2,319 families from the original study, providing 7,013 parent–child trios for analysis. We tested for association using the transmission disequilibrium test.
Seventeen of the 18 susceptibility loci reached nominal levels of significance (p<0.05) in the expanded family collection, with 14q24.1 just falling short (p=0.055). When we allowed for multiple testing, ten of the 17 nominally significant loci reached the required level of significance (p<2.8×10⁻³). All susceptibility loci had consistent direction of effects with the original study.
The results for the novel GWA study-identified loci are genuine and not due to population stratification. The next step, namely correlation of the most disease-associated genotypes with phenotypes, such as RNA and protein expression analyses for the candidate genes within or near each of the susceptibility regions, can now proceed.
The online version of this article (doi:10.1007/s00125-012-2450-3) contains peer-reviewed but unedited supplementary material, including a full list of members of the Type 1 Diabetes Genetics Consortium, which is available to authorised users.
The publication of the first type 1 diabetes locus found by a genome-wide association (GWA) study in 2006 (IFIH1) heralded a new era in susceptibility locus discovery in this common autoimmune disease. Over 50 susceptibility loci have now been identified (www.t1dbase.org). Eighteen of these were identified by Barrett et al. in a GWA meta-analysis of 7,514 cases and 9,045 controls (meta-analysis p<1×10⁻⁶) and confirmed in 4,267 cases, 4,670 controls and 2,319 affected sib-pair families (providing 4,342 parent–child trios; replication p<0.01; discovery and replication p<5×10⁻⁸). However, in the family component of the replication samples, eight of the confirmed 18 susceptibility loci failed to reach nominal levels of significance (p<0.05; inferred from the reported 95% confidence intervals for the relative risks and assuming two-sided significance tests). Although replication was based on the combined evidence from case/control and family collections, and no evidence of population stratification in the case/control collection had been found previously [2, 3], family-based evidence, if possible, remains important in order to demonstrate that these associations did not arise through population stratification bias. Such a bias can occur when a single-nucleotide polymorphism (SNP) differs in allele frequency across subgroups of the population and risk of disease differs between these subgroups.
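The stratification mechanism described above can be illustrated with a small numerical sketch. All counts below are invented for illustration and are not taken from this study; the two subgroups "A" and "B" and the helper `odds_ratio` are hypothetical, and the example uses carrier status rather than allele counts to keep the arithmetic simple. Within each subgroup the variant is unrelated to disease, yet pooling the subgroups produces an apparent association.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio for carrying the variant, cases versus controls."""
    case_non = case_total - case_carriers
    control_non = control_total - control_carriers
    return (case_carriers * control_non) / (case_non * control_carriers)


# Hypothetical subgroups (invented counts).
# A: 10% carriers, 1% disease risk. B: 40% carriers, 4% disease risk.
# Within each subgroup, carriers are equally common in cases and controls.
subgroups = {
    "A": {"cases": 1000, "case_carriers": 100,
          "controls": 99000, "control_carriers": 9900},
    "B": {"cases": 4000, "case_carriers": 1600,
          "controls": 96000, "control_carriers": 38400},
}

# Odds ratio within each subgroup: no association (exactly 1.0).
per_group_or = {
    name: odds_ratio(g["case_carriers"], g["cases"],
                     g["control_carriers"], g["controls"])
    for name, g in subgroups.items()
}

# Pool the subgroups, ignoring population membership.
keys = ("cases", "case_carriers", "controls", "control_carriers")
pooled = {k: sum(g[k] for g in subgroups.values()) for k in keys}
pooled_or = odds_ratio(pooled["case_carriers"], pooled["cases"],
                       pooled["control_carriers"], pooled["controls"])

print(per_group_or)          # both subgroup odds ratios are 1.0
print(round(pooled_or, 2))   # pooled odds ratio is about 1.56
```

A family-based design avoids this artefact because each affected child is compared with the untransmitted alleles of their own parents, who necessarily come from the same subgroup.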
Based on the number of case/control and parent–child trio replication samples used in Barrett et al., if we assume that the parent–child trios equate to an equal number of cases and controls, the power of the case/control and family replication sets would have been similar and the potential impact of winner’s curse (the upward bias of the effect size of the initial finding) on replication would not differ between the replication sample sets. However, in type 1 diabetes, the effects (as measured by relative risk) of non-HLA loci tend to be smaller in affected sib-pair families [2, 5], which are enriched for type 1 diabetes with a higher frequency of high-risk HLA genotypes. Consequently, when the family component of the replication samples used in Barrett et al. is considered in isolation, the 2,319 affected sib-pair families are likely to have been underpowered (too few samples analysed) to replicate the initial associations. Therefore, in the present study, we genotyped the best disease-predicting SNPs at the 18 susceptibility loci in an additional 3,108 families (providing 2,801 parent–child trios to the analysis) from the Type 1 Diabetes Genetics Consortium (T1DGC) and the Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory. The analyses of these additional families, combined with the original 2,319 families, provided protection from population stratification bias, and increased power to provide further replication support for the associations of these 18 susceptibility loci.
Subjects After the additional genotyping of 3,108 families (of which 2,322 were of white European ancestry and provided at least one parent–child trio; electronic supplementary material [ESM] Table 1), we had a collection of 5,427 families (including the 2,319 families previously genotyped). All families were collected with appropriate informed consent. We analysed the 4,429 families that were of white European ancestry and provided one or more parent–child trios (ESM Table 1).
Genotyping The best disease-predicting SNPs at the 18 susceptibility loci were genotyped in the additional family samples using the TaqMan 5′ nuclease assay (Applied Biosystems, Warrington, UK) according to the manufacturer’s protocol. Genotyping was performed blind to disease status and double scored to minimise error. Genotype frequencies were tested for deviation from Hardy–Weinberg equilibrium (HWE), and genotype checks were conducted for SNPs that deviated from HWE. We note that disease association can result in deviation from HWE in affected offspring and parents of affected offspring, who are not representative of the general population. The same genotyping technology and protocols had been applied in Barrett et al. for the replication samples.
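As an illustration of the HWE check mentioned above, the following is a minimal sketch of a Pearson chi-square test with one degree of freedom on genotype counts. The text does not state which HWE test was used, so the choice of test, the function name `hwe_chi_square` and the example counts are all assumptions made for illustration.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for deviation from HWE.

    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: the first set is exactly in HWE, the second has a
# marked deficit of heterozygotes.
chi2_ok, p_ok = hwe_chi_square(100, 200, 100)
chi2_bad, p_bad = hwe_chi_square(150, 100, 150)
print(chi2_ok, p_ok)
print(chi2_bad, p_bad)
```

As noted above, a deviation flagged by such a test in affected offspring or their parents is not necessarily a genotyping error, because disease association itself can distort genotype frequencies in these groups.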
Statistical analysis All statistical analyses were performed in either Stata (www.stata.com) or R (www.r-project.org). In R, we used the snpStats package available from the Bioconductor project (www.bioconductor.org), and, in Stata, we used some additional routines available from www-gene.cimr.cam.ac.uk/clayton/software.
The family-based power to replicate the 18 type 1 diabetes susceptibility loci is reported in ESM Table 2. Based on the odds ratios from the case/control component of the replication samples in Barrett et al., which are not subject to winner’s curse, the expanded family collection is well powered, except for 17q21.2/CCR7 (53.4% power at α=0.05; 17.2% power at α=2.8×10⁻³, which corresponds to the Bonferroni adjustment of the 0.05 significance level for the 18 independent tests; ESM Table 2). We have greater than 90% power at α=0.05 for 17/18 loci, and greater than 80% power at α=2.8×10⁻³ for 14/18 loci (17/18 have greater than 60% power at α=2.8×10⁻³).
The best disease-predicting SNPs at the 18 susceptibility loci were analysed using the transmission disequilibrium test, except for the chromosome X locus, rs2664170 Xq28/GAB3, which was analysed using the method proposed by Clayton. As we were attempting to replicate the associations reported in the case/control component of the replication samples analysed in Barrett et al., we performed one-sided significance tests. We tested for population heterogeneity in SNP genotype frequencies across unaffected parents using Kruskal–Wallis one-way analysis of variance. We tested for population heterogeneity in disease association, after generating pseudo-controls, by testing the addition of the genotype–population interaction term to the conditional logistic regression model of disease status on genotype and population. Parent-of-origin and imprinting effects were tested using the Wallace et al. extension of the Weinberg method [8, 9].
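For readers unfamiliar with the transmission disequilibrium test, the sketch below shows its basic autosomal form together with the Bonferroni threshold of 0.05/18 used here. It is a simplified illustration, not the snpStats or Stata routines used in the analysis: the function names `tdt` and `bonferroni_threshold` and the transmission counts are hypothetical, a normal approximation is assumed, and the chromosome X method is not covered.

```python
import math


def tdt(transmitted, untransmitted):
    """Basic TDT from heterozygous-parent transmissions.

    transmitted:   times the putative risk allele was passed to an
                   affected child by a heterozygous parent
    untransmitted: times the other allele was passed instead
    Returns (chi-square with 1 df, two-sided p, one-sided p).
    """
    n = transmitted + untransmitted
    diff = transmitted - untransmitted
    chi2 = diff ** 2 / n
    z = diff / math.sqrt(n)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    # One-sided test in the direction of excess risk-allele transmission.
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
    return chi2, p_two_sided, p_one_sided


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni adjustment."""
    return alpha / n_tests


# Invented counts for a single SNP.
chi2, p_two, p_one = tdt(transmitted=550, untransmitted=450)
threshold = bonferroni_threshold(0.05, 18)   # about 0.0028

print(chi2, p_two, p_one)
print(threshold, p_one < threshold)
```

When the observed excess is in the hypothesised direction, the one-sided p value is half the two-sided value, which is why a one-sided test is the more powerful choice when the direction of effect is fixed in advance by a prior study.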
As no p values have been reported previously for the 18 novel susceptibility loci in the family component of the replication samples, we reanalysed the original data. We excluded 212 families because of either non-white European ancestry based on updated sample information or not providing at least one parent–child trio. Seven of the 18 loci failed to reach p<0.05 in the remaining 2,107 families (providing 4,212 parent–child trios; Table 1). In other words, 11 of the 18 loci reached at least nominal levels of significance. If we applied a Bonferroni adjustment for the 18 independent tests, 15 loci failed to reach p<2.8×10⁻³.
The inclusion of the additional 2,322 families (providing 2,801 trios; 786 families excluded) increased the number of susceptibility loci replicated at p<0.05 from 11 to 17 of the 18 loci. Only ZFP36L1, C14orf181/14q24.1 (p=0.055) failed to reach p<0.05 (Table 1). The number of susceptibility loci replicated at p<2.8×10⁻³ increased from three to ten (Table 1). Importantly, all of the susceptibility loci had consistent direction of effects with the case/control and family replication samples reported in Barrett et al., and there was no evidence of heterogeneity in the disease associations across family collections, despite there being significant SNP genotype frequency differences (ESM Table 3). The difference in SNP genotype frequencies across family collections was not surprising given that Europe is a large and diverse collection of countries. For example, we have a large number of families from Finland, a genetically isolated population, which exhibits many and large differences in common SNP allele frequencies.
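The family and trio counts reported in the Subjects and Results sections can be cross-checked with simple arithmetic. The short sketch below uses only totals stated in the text and derives the exclusion counts as differences rather than taking them as inputs; the variable names are chosen here for illustration.

```python
# Families genotyped and families retained for analysis, as reported.
original_genotyped = 2319
additional_genotyped = 3108
original_analysed = 2107
additional_analysed = 2322

# Parent-child trios contributed by the analysed families, as reported.
original_trios = 4212
additional_trios = 2801

total_collected = original_genotyped + additional_genotyped
total_analysed = original_analysed + additional_analysed
total_trios = original_trios + additional_trios

# Exclusions implied by the genotyped and analysed totals.
excluded_original = original_genotyped - original_analysed
excluded_additional = additional_genotyped - additional_analysed

print(total_collected)      # 5427 families collected
print(total_analysed)       # 4429 families analysed
print(total_trios)          # 7013 trios analysed
print(excluded_original)    # 212 families excluded from the original set
print(excluded_additional)  # 786 families excluded from the additional set
```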
We tested the 17 autosomal loci for parent-of-origin and imprinting effects; only COBL/7p12.1 showed any evidence of biased maternal transmission, p=1.1×10⁻³ (ESM Table 4). However, this needs to be replicated in an independent dataset.
In the expanded family collection, only one of the previously confirmed susceptibility loci failed to reach nominal levels of significance, ZFP36L1, C14orf181/14q24.1, as the p value was just above 0.05. All of the susceptibility loci had consistent direction of effects with the case/control component of the replication samples reported in Barrett et al. (ESM Table 5), and even with our over-conservative threshold for multiple testing, given the very strong prior information that these were true effects, ten loci remained significant after the adjustment for multiple testing. This study clearly demonstrates that additional replication families were required for the 18 susceptibility loci to reach nominal levels of significance and consequently that the previously reported associations (discovery and replication p<5×10⁻⁸) with odds ratios often less than 1.15 (ESM Table 5) did not arise through population stratification bias, thereby further validating the case/control collection results.
After unequivocal replication of type 1 diabetes loci, the next steps involve dense SNP mapping in even larger sample sets and experiments analysing genotype–phenotype associations. For example, studying correlations between type 1 diabetes SNP risk alleles and haplotypes and expression of genes at the RNA and protein levels can identify which genes in the associated regions are more likely to be causal. Consequently, genes with both positional and functional evidence for a role in disease aetiology can reveal the pathways and early precursors or biomarkers underlying the pathogenesis of type 1 diabetes.
We gratefully acknowledge the participation of all the patients and family members. We acknowledge use of DNA from the Human Biological Data Interchange and Diabetes UK for the USA and UK multiplex families, respectively, D. Savage of the Belfast Health and Social Care Trust, C. Patterson and D. Carson of Queen’s University Belfast and P. Maxwell of Belfast City Hospital for the Northern Irish families, the Genetics of Type 1 Diabetes in Finland (GET1FIN; J. Tuomilehto, L. Kinnunen, E. Tuomilehto-Wolf, V. Harjutsalo and T. Valle of the National Public Health Institute, Helsinki) for the Finnish families, and C. Guja and C. Ionescu-Tirgoviste of the Institute of Diabetes ‘N Paulescu’, Romania for the Romanian families. This research uses resources provided by the T1DGC, a collaborative clinical study sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institute of Allergy and Infectious Diseases, National Human Genome Research Institute, National Institute of Child Health and Human Development, and Juvenile Diabetes Research Foundation International (JDRF) and supported by U01 DK062418. We also thank H. Stevens, P. Clarke, G. Coleman, S. Duley, D. Harrison, S. Hawkins, M. Maisuria, T. Mistry and N. Taylor from the JDRF/Wellcome Trust Diabetes and Inflammation Laboratory for preparation of DNA samples and David Clayton from the JDRF/Wellcome Trust Diabetes and Inflammation Laboratory for useful discussions.
Funding This work was funded by the Juvenile Diabetes Research Foundation International, the Wellcome Trust and the National Institute for Health Research Cambridge Biomedical Centre. The Cambridge Institute for Medical Research is in receipt of a Wellcome Trust Strategic Award (079895).
Duality of interest The authors declare that there is no duality of interest associated with this manuscript.
Contribution statement JDC and JMMH conducted analyses and interpreted the data; DS and HS conducted sample handling and genotyping; NMW managed the data; JDC and JAT drafted the article; and all authors contributed to conception and design, revising the article critically for important intellectual content, and gave final approval of the version to be published.
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.