

Diabetologia. Author manuscript; available in PMC 2013 April 29.
Published in final edited form as:
PMCID: PMC3639291

Use of class I and class II HLA loci for predicting age at onset of type 1 diabetes in multiple populations



The aim of this study was to assess, in multiple populations, the role of HLA alleles in early and late age at onset of type 1 diabetes.


Stepwise linear regression models were used to determine which HLA class I and class II risk alleles to include. High-resolution genotyping data for patients from the Type 1 Diabetes Genetics Consortium (T1DGC) collection (n=2278) and from four independent cohorts from Denmark, Sardinia and the US (HBDI and Joslin) (n=1324; total n=3602) were used to assess the role of HLA variation in age at onset and to predict early onset (age ≤5) and late onset (age ≥15) of type 1 diabetes.


In addition to carriage of HLA class I alleles A*24:02, B*39:06, B*44:03 and B*18:01, HLA class II DRB1-DQB1 loci significantly contributed to age at onset explaining 3.4% of its variance in the combined data. HLA genotypes with gender were able to predict late onset in all cohorts studied with area under the curve (AUC) values ranging from 0.58 to 0.63. Similar AUC values (0.59–0.70) were obtained for early onset for most cohorts except in the Sardinian study in which none of the models tested had significant predictive power.


HLA associations with age at onset are consistent across most Caucasian populations, and HLA information can predict some of the risk of early and late onset of type 1 diabetes. Considerable heterogeneity was observed between the Sardinian and other populations, particularly with regard to early age at onset.

Keywords: HLA, type 1 diabetes, age at onset, risk prediction, receiver operating characteristics (ROC) curve, area under the curve (AUC), Type 1 Diabetes Genetics Consortium (T1DGC)


Type 1 diabetes is one of the most widely studied complex genetic disorders, and the genes in the human leukocyte antigen (HLA) region are reported to account for approximately 40–50% of the familial aggregation of type 1 diabetes [1]. Age at onset of type 1 diabetes may modify the metabolic phenotype of the patients and may influence the risk of late complications of diabetes. For example, age at onset of type 1 diabetes significantly modifies the long-term risk of proliferative retinopathy. The highest risk for retinopathy is seen in age-at-onset group 5–14 years, whereas the lowest risk is in age-at-onset group 15–40 years [2]. Similarly patients with onset of diabetes after age 15 have been observed to have a lower risk of diabetic nephropathy and end stage renal disease than do patients diagnosed during adolescence [3]. On the other hand recent studies indicate higher mortality for type 1 diabetes patients diagnosed in late adolescence or adulthood than for patients diagnosed earlier [4].

A significant genetic component for age at onset has been reported, in particular a contribution by specific human leukocyte antigen (HLA) alleles [5–7]. However, previous studies investigating the role of HLA alleles in age at onset have each drawn on a single cohort and have analysed sample sizes of only a few hundred patients at a time [5–7]. The aim of this study was to investigate HLA class I and class II classical loci genotyped in a large collection of patients from the T1DGC to assess their effect on age at onset of type 1 diabetes. We studied different populations of European descent, focusing on the role of specific DRB1-DQB1 genotypes, DPB1 alleles and HLA class I alleles that have previously been implicated in risk of type 1 diabetes or in age at diagnosis. We compared genetic risk prediction models for early age at onset (age <5) and for late age at onset (≥15).

Subjects and Methods

Study subjects

The T1DGC is a large, worldwide, collaborative study aimed at collecting and genotyping new type 1 diabetes families in a highly standardized fashion from multiple populations, to aid in the search for additional type 1 diabetes genes within and outside the HLA region [8]. An individual was designated as affected if he or she had documented type 1 diabetes with onset at ≤37 years of age, had used insulin within 6 months of diagnosis, and had no concomitant disease or disorder associated with diabetes. High-resolution HLA genotyping was performed at eight classical MHC loci by four genotyping centers using standardized typing protocols, reagents, and quality control procedures [9]. In addition to the patient clinical samples collected by the T1DGC, genotyping was also carried out in existing clinical collections. Age at onset and high-resolution genotyping data were therefore also available for samples collected outside the T1DGC framework and contributed for inclusion in various T1DGC projects, including the Danish, HBDI, Joslin and Sardinian collections.

Proband status

For the T1DGC collection, the proband was identified as the first child diagnosed with type 1 diabetes in the family; the 'proband' variable within the data set records this designation. For the existing cohorts, the criteria for proband assignment were not readily available for all pedigrees.

Allele selection and genotype coding

The genetic contribution of DRB1-DQB1 genotypes was encoded as DR3/DR4=4, DR3/DR3=3, DR4/DR4=2, DR4/DRx or DR3/DRx=1, and DRx/DRx=0, where DR3=DRB1*03:01-DQB1*02:01, DR4=DRB1*04:01/2/4/5/8/13-DQB1*03:02 or 03:04 or 02:01, and x is any other haplotype, including DRB1*04:03 or other DRB1*04-carrying haplotypes with DQB1*03:01. Genotypes that included the highly protective allele DQB1*06:02, the haplotype DRB1*14:01-DQB1*05:03 or the haplotype DRB1*07:01-DQB1*03:03 were categorized as DRx/DRx. This ranking was based on previous reports of predisposing, protective and neutral DR-DQ haplotypes [15]. HLA alleles at loci other than DRB1-DQB1 that have been convincingly implicated in risk of type 1 diabetes were also included in the model. These included DPB1 alleles 02:02, 03:01 and 04:02 and class I alleles A*24:02, A*02:01, A*11:01, A*30:02, A*32:01, A*66:01, B*18:01, B*35:02, B*57:01, C*03:03, B*39:06, B*44:03 and C*07:02 [6, 10, 11].
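The DR-DQ coding scheme above can be illustrated with a short sketch. This is not the code used in the study. The haplotype representation (a pair of allele strings without locus prefixes), the function names, and the reading of "04:01/2/4/5/8/13" as the six alleles 04:01, 04:02, 04:04, 04:05, 04:08 and 04:13 are assumptions made for illustration.

```python
# Illustrative sketch of the 0-4 DR-DQ genotype score described in the text.
# Assumption: each haplotype is a (DRB1, DQB1) tuple such as ("03:01", "02:01").
# Assumption: "04:01/2/4/5/8/13" expands to the six DRB1*04 alleles below.

DR4_DRB1 = {"04:01", "04:02", "04:04", "04:05", "04:08", "04:13"}
DR4_DQB1 = {"03:02", "03:04", "02:01"}
PROTECTIVE_HAPLOTYPES = {("14:01", "05:03"), ("07:01", "03:03")}

# Score for each unordered pair of haplotype classes, as listed in the text.
SCORES = {
    ("DR3", "DR4"): 4,
    ("DR3", "DR3"): 3,
    ("DR4", "DR4"): 2,
    ("DR3", "DRx"): 1,
    ("DR4", "DRx"): 1,
    ("DRx", "DRx"): 0,
}


def classify_haplotype(drb1, dqb1):
    """Return 'DR3', 'DR4' or 'DRx' for one DRB1-DQB1 haplotype."""
    if drb1 == "03:01" and dqb1 == "02:01":
        return "DR3"
    if drb1 in DR4_DRB1 and dqb1 in DR4_DQB1:
        return "DR4"
    return "DRx"


def is_protective(drb1, dqb1):
    """True for DQB1*06:02 or one of the two listed protective haplotypes."""
    return dqb1 == "06:02" or (drb1, dqb1) in PROTECTIVE_HAPLOTYPES


def dr_dq_score(hap1, hap2):
    """Score a genotype (two haplotypes) on the 0-4 scale."""
    # Any protective haplotype forces the genotype into the DRx/DRx class.
    if is_protective(*hap1) or is_protective(*hap2):
        return 0
    # Sorting makes the lookup independent of haplotype order.
    classes = tuple(sorted([classify_haplotype(*hap1), classify_haplotype(*hap2)]))
    return SCORES[classes]


if __name__ == "__main__":
    print(dr_dq_score(("03:01", "02:01"), ("04:01", "03:02")))
```

A lookup table keyed on the sorted pair of haplotype classes keeps the ranking explicit and makes the score independent of the order in which the two haplotypes are supplied.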

Risk factor selection

Stepwise linear regressions were carried out with age at onset as the continuous outcome variable, including all the above genetic factors in addition to gender, cohort of origin and proband status.

Outcome variables

Having identified which HLA variables to include, logistic regressions were carried out in each cohort separately, adjusting for proband status and including the HLA alleles that were found to influence age at onset in the stepwise linear regression analyses. Two binary outcome variables were defined: (1) “early age at onset”, coded as 1 if age at onset ≤5 (28.6% of patients) or 0 if age at onset >5 (71.4% of patients); and (2) “late age at onset”, coded as 1 if age at onset >15 (21.5% of patients) or 0 if age at onset ≤15 (78.5% of patients). These age cut-offs were based on the ages at which differences in rates of complications and mortality have been observed [3, 4].
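As a small illustration of the two outcome definitions, the coding could be written as follows. The function name and the dictionary output are hypothetical; the cut-offs are those stated in this section (≤5 years for early onset, >15 years for late onset).

```python
# Illustrative coding of the two binary outcomes from age at onset (years).
# Cut-offs follow the definitions in this section; names are hypothetical.

def code_outcomes(age_at_onset):
    """Return the early-onset and late-onset indicators for one patient."""
    return {
        "early": 1 if age_at_onset <= 5 else 0,   # early: onset at or before 5
        "late": 1 if age_at_onset > 15 else 0,    # late: onset after 15
    }


if __name__ == "__main__":
    for age in (3, 5, 10, 15, 22):
        print(age, code_outcomes(age))
```

Note that a patient with onset between 5 and 15 years is coded 0 for both outcomes, so the two indicators are not complements of one another.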

Inter-study heterogeneity

Inter-study heterogeneity was assessed using a DerSimonian–Laird random-effects meta-analysis and computing the heterogeneity variance τ. The rmeta library in R was used.
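For readers unfamiliar with the method, the DerSimonian–Laird estimator can be sketched in a few lines. This is a generic textbook implementation, not the rmeta code used in the study; the function name and the input format (per-cohort log odds ratios and their variances) are assumptions. The sketch returns the between-study variance in its conventional notation, τ².

```python
import math

# Generic DerSimonian-Laird random-effects meta-analysis (illustrative only).
# Inputs are per-cohort effect estimates (e.g. log odds ratios) and their
# sampling variances; both the name and the interface are hypothetical.

def dersimonian_laird(effects, variances):
    k = len(effects)
    w = [1.0 / v for v in variances]          # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments estimate of the between-study variance, floored at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"pooled": pooled, "se": se, "tau2": tau2, "Q": q}


if __name__ == "__main__":
    # Made-up log odds ratios and variances for five hypothetical cohorts.
    result = dersimonian_laird([0.30, 0.10, 0.45, 0.25, -0.05],
                               [0.02, 0.05, 0.04, 0.03, 0.08])
    print(result)
```

When Q is no larger than its degrees of freedom (k − 1), the estimated between-study variance is zero and the random-effects estimate coincides with the fixed-effect estimate.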

Calibration and discrimination

The predictive power of a given diagnostic is usually summarized by a receiver operating characteristic (ROC) curve. In this type of analysis, subjects are ranked in descending order of their predicted risk, and the cumulative proportion of cases (subjects with the outcome) is plotted against the corresponding cumulative proportion of non-cases; i.e., the sensitivity (true positive fraction) is plotted on the y-axis against 1 − specificity (the false positive fraction) on the x-axis [12]. A perfect diagnostic would be represented by a line that starts at the origin, travels up the y-axis to 1 and then across the top of the plot to an x-axis value of 1, thus having a total area under the curve (AUC) of 1. A test with AUC=0.5, on the other hand, has zero diagnostic value. Whereas discrimination examines the ability to correctly classify subjects into different groups, calibration assesses how closely the predicted probabilities reflect actual risk [12]. The calibration and discrimination abilities of the models were examined in the independent cohorts described above.
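The AUC has a convenient interpretation: it equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. The sketch below computes it directly from that definition. It is an illustration only, not the PredictABEL routine used in the study, and the function name is hypothetical.

```python
# AUC computed as the fraction of (case, non-case) pairs in which the case
# has the higher predicted risk; ties count as 0.5. Illustrative only.

def roc_auc(scores, labels):
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


if __name__ == "__main__":
    # Made-up predicted risks and outcomes for four hypothetical patients.
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The pairwise form is quadratic in the number of subjects, which is acceptable for cohorts of a few thousand patients; a rank-based formula gives the same value more efficiently.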

A risk score was calculated for each individual using the logit equation:

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \alpha + \sum_{i} \beta_i X_i$$

where p is the probability of the outcome (early or late age at onset), α is the constant and β_i is the natural logarithm of the odds ratio for a specific predictor X_i.

The logit operator maintains the linearity of the model and allows the calculation of a probability of the outcome (in this case early or late age at onset) given the different sets of predictors, according to p = exp(logit)/(1 + exp(logit)). Thus, the higher the risk score, the greater the predicted probability of the outcome.
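The relationship between odds ratios, the risk score and the predicted probability can be made concrete with a minimal sketch. The function names and the single made-up predictor are hypothetical; only the two formulas (the linear risk score and the inverse logit) come from the text.

```python
import math

# Risk score (logit) as a constant plus a weighted sum of predictors, and the
# inverse-logit transformation to a probability. Names are hypothetical.

def risk_score(alpha, betas, predictors):
    """alpha + sum(beta_i * X_i); predictors missing from the dict count as 0."""
    return alpha + sum(beta * predictors.get(name, 0) for name, beta in betas.items())


def probability(logit):
    """Inverse logit; algebraically equal to exp(logit) / (1 + exp(logit))."""
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # A made-up predictor with odds ratio 2.0, so beta = ln(2.0).
    betas = {"marker": math.log(2.0)}
    carrier = probability(risk_score(0.0, betas, {"marker": 1}))
    non_carrier = probability(risk_score(0.0, betas, {"marker": 0}))
    print(non_carrier, carrier)
```

With a constant of zero, the non-carrier has odds of 1 and the carrier has odds of 2, which is exactly the odds ratio encoded by β.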

The individuals were classified into different subgroups according to their risk scores, and the observed and predicted frequencies of the outcome in each subgroup were calculated. The Hosmer–Lemeshow χ² goodness-of-fit statistic was used to assess calibration by comparing observed and predicted risk [13]. Non-significant p-values for this test indicate good calibration. Both discrimination and calibration of the risk models were assessed using the PredictABEL package for R.
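The Hosmer–Lemeshow statistic can be sketched as follows. This is a generic illustration rather than the PredictABEL implementation; the function name, the equal-size grouping and the default of ten groups are assumptions. The sketch returns the statistic and its degrees of freedom (number of groups minus 2); the p-value would then be read from a χ² distribution with those degrees of freedom.

```python
# Generic Hosmer-Lemeshow goodness-of-fit statistic (illustrative only).
# Subjects are sorted by predicted probability and split into equal-size
# groups; observed and expected event counts are compared within each group.

def hosmer_lemeshow(probs, outcomes, groups=10):
    # Sort by predicted probability only, keeping input order among ties.
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    statistic = 0.0
    used_groups = 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        denominator = expected * (1.0 - expected / size)
        # Groups with expected count 0 or equal to the group size contribute
        # nothing here; a production implementation should handle them.
        if denominator > 0:
            statistic += (observed - expected) ** 2 / denominator
        used_groups += 1
    return statistic, used_groups - 2


if __name__ == "__main__":
    # Made-up predictions and outcomes for eight hypothetical patients.
    probs = [0.1, 0.2, 0.2, 0.3, 0.6, 0.7, 0.8, 0.9]
    outcomes = [0, 0, 1, 0, 1, 0, 1, 1]
    print(hosmer_lemeshow(probs, outcomes, groups=4))
```

A small statistic relative to its degrees of freedom indicates that predicted and observed frequencies agree within the risk groups.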


Results

The mean and standard deviation of age at onset by gender and proband status for each of the cohorts are summarized in Table 1. Genotyping and age at onset information for a total of 3602 type 1 diabetes patients corresponding to 1801 affected sib pairs from the T1DGC, including the extant collection, were included. The overall range for age at onset was 0–37 years.

Table 1
Age at onset and gender distribution by cohort of the patients included in the study.

Age at onset was found to be significantly higher in males than in females (p<0.005). In the T1DGC collection, where the proband was defined as the first child to develop type 1 diabetes, a strong difference in age at onset is seen between probands and non-probands (Table 1). In the T1DGC collection and some of the pre-existing collections, proband status appears to be a confounding variable for younger onset.

A stepwise linear regression was carried out, which included DPB1 alleles 02:02, 03:01 and 04:02 and class I alleles A*24:02, A*02:01, A*11:01, A*32:01, A*66:01, B*18:01, B*35:02, B*57:01, C*03:03, B*39:06, B*44:03 and C*07:02, adjusting for gender, cohort of origin and proband status. This analysis revealed that, among the HLA variables, only DR-DQ genotype, A*24:02 and B*39:06 contributed significantly to age at onset of type 1 diabetes, with B*44:03 and B*18:01 nearing statistical significance (Table 2).

Table 2
Stepwise linear regression of factors associated with type 1 diabetes age at onset in 1801 sib pairs

Individual Population Effects

The allele/genotype frequencies for DR3/4, A*24:02, B*18:01, B*39:06 and B*44:03 are shown in Table 3. The frequencies are stratified by early age at onset (age ≤5) versus not, by late age at onset (age ≥15) versus not, and by early onset (age ≤5) versus late onset (age ≥15). Differences in allele and genotype frequencies are seen among the various populations, as is expected for genes in the HLA region. The effect of these alleles and genotypes on early and late onset was assessed by multiple logistic regression including all genetic variables in the model in addition to gender and proband status. We observed a striking difference in the frequency of B*18:01 among Sardinian patients compared with the other groups (Table 3). A much higher frequency of certain DR3 haplotypes in this population, compared with other European populations, has already been reported [14].

Table 3
Comparison of HLA alleles associated with age of onset and their association with early and late age of onset in different populations. Odds ratios were derived from a logistic regression model including all genetic variables as well as gender and proband ...

Using random-effects meta-analysis, we investigated whether there was evidence of statistically significant heterogeneity between study cohorts, i.e. whether, regardless of allele frequency, the effect on age at onset differed between cohorts for each of the five genetic variables studied. We found no evidence of inter-study heterogeneity for any of the early onset genetic effects, nor for any of the early versus late onset associations, with the smallest p-value for heterogeneity being p=0.20. For both of these traits (early versus other and early versus late), meta-analyses of the genetic effects of B*18:01 and B*44:03 did not reach statistical significance; all other associations were statistically significant overall. For late age at onset, we observed evidence of inter-study heterogeneity for the effect of B*39:06 (τ=0.086, p=0.05). By meta-analysis, the only genetic effects that were significantly associated with late age at onset were the DR-DQ genotype and A*24:02, indicating that these are the most consistent effects throughout the age at onset distribution. In the absence of significant heterogeneity within the T1DGC sub-cohorts, we merged them for the risk prediction analysis.

We then assessed whether these HLA markers could predict early or late age at onset.

Risk Prediction Models

Logistic regressions on the early onset and late onset outcomes were then fitted using three different models for each outcome: (1) gender as the only risk factor; (2) HLA as the only risk factor; and (3) gender and HLA as risk factors. Early versus late models were not fitted given the small sample sizes involved.

The following models were fitted:

  1. Early age at onset
    • 1)
      gender only
      Logit = −1.00254 + 0.07075 gender
    • 2)
      HLA only
      Logit = −1.611245 + 0.12472 B*18:01 + 0.855414 B*39:06 − 0.3341 B*44:03 + 0.2947 A*24:02 + 0.1869 DR-DQ
    • 3)
      HLA + gender
      Logit = −1.91141 + 0.096 gender + 0.1225 B*18:01 + 0.29309 A*24:02 + 0.8556 B*39:06 − 0.3394 B*44:03 + 0.18356 DR-DQ
  2. Late age at onset
    • 1)
      gender only
      Logit = −1.3957 − 0.1790 gender
    • 2)
      HLA only
      Logit = −0.8281 − 0.008191 B*18:01 − 0.60403 B*39:06 + 0.11133 B*44:03 − 0.4537 A*24:02 + 0.18383 DR-DQ
    • 3)
      HLA + gender
      Logit = −0.6588 − 0.207 proband status − 0.207 gender − 0.0478 B*18:01 − 0.5087 A*24:02 − 0.4058 B*39:06 + 0.2339 B*44:03 − 0.1621 DR-DQ
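To show how the fitted models translate into individual predictions, the sketch below applies the three early-onset equations listed above to a hypothetical patient. The coefficients are copied from the text; everything else is an assumption made for illustration: the 0/1 coding of gender (the text does not state which sex is coded 1), the 0/1 coding of class I allele carriage, the 0–4 DR-DQ score, and all function and variable names.

```python
import math

# Early-onset models with coefficients as listed in the text.
# Assumptions: gender and allele carriage are coded 0/1, DR-DQ is the 0-4
# genotype score, and any predictor missing for a patient is treated as 0.
EARLY_ONSET_MODELS = {
    "gender_only": {"intercept": -1.00254, "gender": 0.07075},
    "hla_only": {
        "intercept": -1.611245, "B*18:01": 0.12472, "B*39:06": 0.855414,
        "B*44:03": -0.3341, "A*24:02": 0.2947, "DR-DQ": 0.1869,
    },
    "hla_and_gender": {
        "intercept": -1.91141, "gender": 0.096, "B*18:01": 0.1225,
        "A*24:02": 0.29309, "B*39:06": 0.8556, "B*44:03": -0.3394,
        "DR-DQ": 0.18356,
    },
}


def predict_early_onset(model_name, patient):
    """Predicted probability of early onset under one of the three models."""
    coefficients = EARLY_ONSET_MODELS[model_name]
    logit = coefficients["intercept"] + sum(
        beta * patient.get(name, 0)
        for name, beta in coefficients.items()
        if name != "intercept"
    )
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Hypothetical patient: DR3/DR4 genotype (score 4) carrying A*24:02.
    patient = {"gender": 1, "DR-DQ": 4, "A*24:02": 1}
    for name in EARLY_ONSET_MODELS:
        print(name, round(predict_early_onset(name, patient), 3))
```

Because the published equations do not state the predictor coding, the printed probabilities should be read as a demonstration of the mechanics rather than as clinically meaningful estimates.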

The risk discrimination and calibration results from these models in all cohorts are shown in Table 4.

Table 4
Validation of the risk prediction models for type 1 diabetes age at onset using HLA information.

The best prediction for early onset was seen in the JOS cohort for the HLA-only model, which yielded an area under the curve (AUC) of 0.700 (Table 4). For late onset, the best AUC was seen in the DAN cohort for the model including both gender and HLA (AUC=0.644). For all other cohorts and outcomes except one, at least one of the models had an AUC significantly higher than 0.5, the value at which the test would have no predictive value. In the Sardinian cohort, none of the risk prediction models had an AUC significantly different from 0.5.

For late onset, the models that included gender and HLA had calibration problems in two of the cohorts because of the heterogeneous relationship between gender and age at onset among cohorts. We also note that the wide confidence intervals for the AUC in some of the cohorts are likely due to the smaller sample sizes available in those studies.


Discussion

In the current study we have investigated the role of HLA high resolution genotypes on age at T1D onset in various populations of European descent. To our knowledge this is the first study to compare the role of HLA on age of onset in different populations.

We further investigated whether such genotyping information could have any predictive value for assessing the risk of very young age at onset in contrast to late onset for type 1 diabetes. Using the largest data set to date to address this question we found that the strongest genetic contribution to age of onset appears to come from the DRB1-DQB1 genotypes, which also have the strongest influence on disease risk [15]. In addition a few select class I alleles, notably A*24:02, B*18:01, B*39:06, and B*44:03 also influence age at onset. Of these the most consistent effect is that of A*24:02, whereas the other class I alleles either do not influence specific cut-offs of age of onset (early versus late) or show evidence of strong heterogeneity across populations (e.g. B*39:06 for late age of onset).

In both the T1DGC collection and other independent extant collections, we find that genotypes for classical class I and class II HLA can have some modest predictive power for these two outcomes. On the one hand, this confirms the role of HLA polymorphism in influencing age at onset. On the other hand, it highlights that other risk factors not included in our models must also be influencing age at type 1 diabetes onset.

Current approaches for the prediction of type 1 diabetes in screening studies take advantage of the major genetic risk factors, genotyping for HLA-DR and HLA-DQ loci and screening for autoantibodies directed against islet-cell antigens [16]. For example, children who carry both of the highest-risk HLA haplotypes (DR3/DR4–DQB1*03:02) have a risk of approximately 1 in 20 for a diagnosis of T1D by the age of 15 years [16]. The results presented here may help improve such models by taking into account also the role of genetic risk factors on age at onset.

We found that gender had little or no predictive value and that, because its relationship with age at onset was not consistent across cohorts, in most instances it did not improve the AUC. For the DAN and HBDI cohorts, where the difference in age at onset between genders was strongest, the inclusion of gender did yield a slight improvement, but not in an additive way. This is consistent with what has been reported for the combination of genetic and non-genetic factors in other disease areas [17].

We note several study limitations. Our analyses have used data derived from affected sib pair cohorts of European descent, selecting for patients with a strong genetic contribution to type 1 diabetes and therefore possibly also to its age at diagnosis. The current data are thus reflective of the prediction of HLA in a group of patients enriched for genetic risk. On the other hand, these data are relevant to clinical research, as studies of first degree relatives (follow-up, prevention trials) involve those who have a family member [16, 18] already diagnosed with T1D and genetic factors combined with other factors could be applied to the analysis of data from cohorts of relatives. In addition, these results highlight the differences between European descent populations and illustrate the limits and the extent to which HLA may be helpful in predicting age at disease onset.

We have developed and calibrated three risk prediction models for age at early and late onset of type 1 diabetes, based on five independent patient collections. We hope that these models may be used as pilots to lead further research in defining risk prediction for age at onset using other risk factors (e.g., environmental exposures, autoantibodies). The models may be applied at the individual level to predict the most likely category of age at onset (early or late), but also at the population level, with reference to other relative risks from published studies, to estimate the potential population risk reduction that may be gained by primary prevention of any modifiable risk factors that influence type 1 diabetes and the ensuing complications.


This research utilizes resources provided by the Type 1 Diabetes Genetics Consortium, a collaborative clinical study sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institute of Allergy and Infectious Diseases (NIAID), National Human Genome Research Institute (NHGRI), National Institute of Child Health and Human Development (NICHD), and Juvenile Diabetes Research Foundation International (JDRF) and supported by U01 DK062418. This work was also supported by R01 DK61722 (J. A. N.).


Author Contributions: H.A.E. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published; J.C. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published; M.V. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published; P.V.M. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published; J.An. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published; A.M.V. contributed to the study conception, data collection and data analyses, wrote the manuscript and gave final approval to the version to be published.

Duality of interest statement: The authors declare that there is no duality of interest associated with this manuscript.


References

1. Noble JA, Valdes AM. Genetics of the HLA Region in the Prediction of Type 1 Diabetes. Curr Diab Rep. 2011 Dec;11(6):533–42.
2. Hietala K, Harjutsalo V, Forsblom C, Summanen P, Groop PH; FinnDiane Study Group. Age at onset and the risk of proliferative retinopathy in type 1 diabetes. Diabetes Care. 2010 Jun;33(6):1315–9.
3. Finne P, Reunanen A, Stenman S, Groop PH, Grönhagen-Riska C. Incidence of end-stage renal disease in patients with type 1 diabetes. JAMA. 2005;294:1782.
4. Harjutsalo V, Maric C, Forsblom C, Thorn L, Wadén J, Groop PH. Sex-related differences in the long-term risk of microvascular complications by age at onset of type 1 diabetes. Diabetologia. 2011;54:1992–9.
5. Valdes AM, Thomson G, Erlich HA, Noble JA. Association between type 1 diabetes age of onset and HLA among sibling pairs. Diabetes. 1999;48(8):1658–61.
6. Valdes AM, Erlich HA, Noble JA. Human leukocyte antigen class I B and C loci contribute to Type 1 Diabetes (T1D) susceptibility and age at T1D onset. Hum Immunol. 2005 Mar;66(3):301–13.
7. Tait BD, Harrison LC, Drummond BP, Stewart V, Varney MD, Honeyman MC. HLA antigens and age at diagnosis of insulin-dependent diabetes mellitus. Hum Immunol. 1995 Feb;42(2):116–22.
8. Barrett JC, Clayton DG, Concannon P, et al. Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nat Genet. 2009;41:703–7.
9. Mychaleckyj JC, Noble JA, Moonsamy PV, et al. HLA genotyping in the international Type 1 Diabetes Genetics Consortium. Clinical Trials (London, England). 2010;7:S75–87.
10. Varney MD, Valdes AM, Carlson JA, et al. HLA DPA1, DPB1 alleles and haplotypes contribute to the risk associated with type 1 diabetes: analysis of the type 1 diabetes genetics consortium families. Diabetes. 2010;59(8):2055–62.
11. Noble JA, Valdes AM, Varney MD, et al. HLA class I and genetic susceptibility to type 1 diabetes: results from the Type 1 Diabetes Genetics Consortium. Diabetes. 2010;59(11):2972–9.
12. Altman DG, Bland JM. Diagnostic tests 3: receiver operating characteristic plots. BMJ. 1994;309:188.
13. Hosmer DW, Lemeshow S. Applied Logistic Regression. Wiley; New York: 2000.
14. Koeleman BP, Herr MH, Zavattari P, et al. Conditional ETDT analysis of the human leukocyte antigen region in type 1 diabetes. Ann Hum Genet. 2000 May;64(Pt 3):215–21.
15. Erlich H, Valdes AM, Noble J, et al. HLA DR-DQ haplotypes and genotypes and type 1 diabetes risk: analysis of the type 1 diabetes genetics consortium families. Diabetes. 2008 Apr;57(4):1084–92.
16. Aly TA, Ide A, Humphrey K, et al. Genetic prediction of autoimmunity: initial oligogenic prediction of anti-islet autoimmunity amongst DR3/DR4-DQ8 relatives of patients with type 1A diabetes. J Autoimmun. 2005;25(Suppl):40–45.
17. Janssens AC, Ioannidis JP, van Duijn CM, Little J, Khoury MJ; GRIPS Group. Strengthening the reporting of genetic risk prediction studies: the GRIPS statement. Eur J Clin Invest. 2011;41(9):1004–9.
18. Aly TA, Ide A, Jahromi MM, et al. Extreme genetic risk for type 1A diabetes. Proc Natl Acad Sci U S A. 2006;103:14074–14079.