To assess, in multiple populations, the role of HLA alleles in early and late age at onset of type 1 diabetes.
Stepwise linear regression models were used to determine which HLA class I and class II risk alleles to include. High-resolution genotyping data for patients from the Type 1 Diabetes Genetics Consortium (T1DGC) collection (n=2278) and four independent cohorts from Denmark, Sardinia and the US (HBDI and Joslin) (n=1324), for a total of 3602 patients, were used to assess the role of HLA variation on age of onset and to predict early onset (age ≤5) and late onset (age ≥15) of type 1 diabetes.
In addition to carriage of HLA class I alleles A*24:02, B*39:06, B*44:03 and B*18:01, HLA class II DRB1-DQB1 loci significantly contributed to age at onset explaining 3.4% of its variance in the combined data. HLA genotypes with gender were able to predict late onset in all cohorts studied with area under the curve (AUC) values ranging from 0.58 to 0.63. Similar AUC values (0.59–0.70) were obtained for early onset for most cohorts except in the Sardinian study in which none of the models tested had significant predictive power.
HLA associations with age of onset are consistent across most Caucasian populations, and HLA information can predict some of the risk of early and late onset of type 1 diabetes. Considerable heterogeneity was observed between Sardinian and other populations, particularly with regard to early age of onset.
Type 1 diabetes is one of the most widely studied complex genetic disorders, and the genes in the human leukocyte antigen (HLA) region are reported to account for approximately 40–50% of the familial aggregation of type 1 diabetes. Age at onset of type 1 diabetes may modify the metabolic phenotype of the patients and may influence the risk of late complications of diabetes. For example, age at onset of type 1 diabetes significantly modifies the long-term risk of proliferative retinopathy. The highest risk for retinopathy is seen in age-at-onset group 5–14 years, whereas the lowest risk is in age-at-onset group 15–40 years. Similarly patients with onset of diabetes after age 15 have been observed to have a lower risk of diabetic nephropathy and end stage renal disease than do patients diagnosed during adolescence. On the other hand recent studies indicate higher mortality for type 1 diabetes patients diagnosed in late adolescence or adulthood than for patients diagnosed earlier.
A significant genetic component for age at onset has been reported, in particular a contribution by specific human leukocyte antigen (HLA) alleles [5–7]. However, previous studies investigating the role of HLA alleles on age of onset have each drawn on a single cohort and have analysed sample sizes of only a few hundred patients at a time [5–7]. The aim of this study was to investigate HLA class I and class II classical loci genotyped in a large collection of patients from the T1DGC collection to assess their effect on age at onset of type 1 diabetes. We have studied different populations of European descent, focusing on the role of specific DRB1-DQB1 genotypes, DPB1 and HLA class I alleles that have been previously implicated in risk of type 1 diabetes or in age at diagnosis. We have compared genetic risk prediction models for early age at onset (age ≤5) and for late age at onset (age ≥15).
The T1DGC is a large, worldwide, collaborative study aimed at collecting and genotyping new type 1 diabetes families in a highly standardized fashion from multiple populations, to aid in the search for additional type 1 diabetes genes within and outside the HLA region. An individual was designated as affected if he or she had documented type 1 diabetes with onset at ≤37 years of age, had used insulin within 6 months of diagnosis, and had no concomitant disease or disorder associated with diabetes. High resolution HLA genotyping was performed at eight classical MHC loci by four genotyping centers using standardized typing protocols, reagents, and quality control procedures. In addition to the patient clinical samples collected by the T1DGC, genotyping was also carried out in existing clinical collections. Age at onset and high resolution genotyping data were also available for samples and data collected outside of the T1DGC framework and contributed for inclusion in various T1DGC projects, including the Danish, HBDI, Joslin and Sardinian collections.
For the T1DGC collection, the proband was identified as the first child diagnosed with type 1 diabetes in the family, and is recorded in the 'proband' variable within the data set. For the existing cohorts, the criteria for proband assignment were not readily available for all pedigrees.
The genetic contribution of DRB1-DQB1 genotypes was encoded as DR3/DR4=4, DR3/DR3=3, DR4/DR4=2, DR4/DRx or DR3/DRx=1, and DRx/DRx=0, where DR3=DRB1*03:01-DQB1*02:01, DR4=DRB1*04:01/2/4/5/8/13-DQB1*03:02 or 03:04 or 02:01, and x is any other haplotype, including DRB1*04:03 or other DRB1*04-carrying haplotypes with DQB1*03:01. Genotypes that included the highly protective allele DQB1*06:02, or the haplotypes DRB1*14:01-DQB1*05:03 or DRB1*07:01-DQB1*03:03, were categorized as DRx/DRx. This ranking was based on previous reports of predisposing, protective and neutral DR-DQ haplotypes. HLA alleles at loci other than DRB1-DQB1 that have been convincingly implicated in risk of type 1 diabetes were also included in the model. These included DPB1 alleles 02:02, 03:01 and 04:02 and class I alleles A*24:02, A*02:01, A*11:01, A*30:02, A*32:01, A*66:01, B*18:01, B*35:02, B*57:01, C*03:03, B*39:06, B*44:03 and C*07:02 [6,10–11].
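The DR-DQ ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, alleles are passed as bare strings without the locus prefix, and the shorthand "04:01/2/4/5/8/13" is read as DRB1*04:01, 04:02, 04:04, 04:05, 04:08 and 04:13.

```python
# Hypothetical helper names; the allele sets follow the encoding described in the text.
DR4_DRB1 = {"04:01", "04:02", "04:04", "04:05", "04:08", "04:13"}
DR4_DQB1 = {"03:02", "03:04", "02:01"}
PROTECTIVE_HAPLOTYPES = {("14:01", "05:03"), ("07:01", "03:03")}

# Score for each unordered pair of haplotype classes.
GENOTYPE_SCORE = {
    ("DR3", "DR4"): 4,
    ("DR3", "DR3"): 3,
    ("DR4", "DR4"): 2,
    ("DR3", "DRx"): 1,
    ("DR4", "DRx"): 1,
    ("DRx", "DRx"): 0,
}


def classify_haplotype(drb1, dqb1):
    """Return 'DR3', 'DR4' or 'DRx' for one DRB1-DQB1 haplotype."""
    if drb1 == "03:01" and dqb1 == "02:01":
        return "DR3"
    if drb1 in DR4_DRB1 and dqb1 in DR4_DQB1:
        return "DR4"
    return "DRx"


def is_protective(drb1, dqb1):
    """True if the haplotype carries DQB1*06:02 or is one of the listed haplotypes."""
    return dqb1 == "06:02" or (drb1, dqb1) in PROTECTIVE_HAPLOTYPES


def score_genotype(hap1, hap2):
    """Score a genotype given as two (DRB1, DQB1) tuples."""
    # Any protective haplotype places the whole genotype in the DRx/DRx category.
    if is_protective(*hap1) or is_protective(*hap2):
        return 0
    classes = tuple(sorted([classify_haplotype(*hap1), classify_haplotype(*hap2)]))
    return GENOTYPE_SCORE[classes]
```

For instance, `score_genotype(("03:01", "02:01"), ("04:01", "03:02"))` falls in the DR3/DR4 category, while pairing the same DR3 haplotype with a DQB1*06:02 haplotype is scored as DRx/DRx.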
Stepwise linear regressions were carried out using age at onset as the outcome continuous variable including all the above genetic factors in addition to gender, cohort of origin and proband status.
Having identified which HLA variables to include, logistic regressions were carried out in each cohort separately, adjusting for proband status and including the HLA alleles that were found to influence age at onset in the stepwise linear regression analyses. Two binary outcome variables were defined: (1) "early age at onset", coded as 1 if age at onset ≤5 (28.6% of patients) or 0 if age at onset >5 (71.4% of patients); (2) "late age at onset", coded as 1 if age at onset >15 (21.5% of patients) or 0 if age at onset ≤15 (78.5% of patients). These age cut-offs were defined based on the ages at which differences in rates of complications and mortality have been observed [3–4].
Inter-study heterogeneity was assessed using a DerSimonian-Laird random effects meta-analysis and computing the heterogeneity variance τ. The rmeta library in R was used (http://cran.r-project.org/web/packages/rmeta/rmeta.pdf).
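As a rough illustration of what a DerSimonian-Laird analysis computes, the sketch below pools per-cohort effect estimates (for example, log odds ratios) and derives the between-study variance by the method of moments. It is written in plain Python rather than with the rmeta package used in the study, the function name is hypothetical, and it returns the variance on the squared scale (τ²).

```python
import math


def dersimonian_laird(effects, variances):
    """Random effects pooling of per-study effects with known within-study variances.

    Returns (pooled_effect, standard_error, tau_squared, cochran_q).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Cochran's Q measures dispersion of study effects around the fixed estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Method-of-moments estimate of the between-study variance, truncated at zero.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau_squared = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau_squared) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    standard_error = math.sqrt(1.0 / sum(re_weights))
    return pooled, standard_error, tau_squared, q
```

A heterogeneity p-value would be obtained by comparing Q with a χ2 distribution on k − 1 degrees of freedom, where k is the number of cohorts; that step is left out here to keep the sketch free of external dependencies.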
The predictive power of a given diagnostic is usually summarized by a receiver operating characteristic (ROC) curve. In this type of analysis, subjects are ranked in descending order of their predicted risk and the cumulative proportion of subjects who develop disease (cases) is plotted against the corresponding cumulative proportion of the population, i.e., the sensitivity (true positive fraction) is plotted on the y-axis against 1 − specificity (the false positive fraction) on the x-axis. A perfect diagnostic would be represented by a line that starts at the origin, travels up the y-axis to 1 and then across the top of the plot to an x-axis value of 1, thus having a total area under the curve (AUC) of 1. A test with AUC=0.5, on the other hand, has zero diagnostic value. Whereas discrimination examines the ability to correctly classify subjects into different groups, calibration assesses how closely the predicted probabilities reflect actual risk. Calibration and discrimination abilities of the models were examined in the independent cohorts described above.
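The AUC has a simple interpretation that can be computed directly: it is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. The minimal sketch below uses that pairwise definition; the function name is hypothetical and the approach is quadratic in the number of subjects, so it is meant for illustration rather than for large data sets.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from predicted risks and 0/1 outcome labels."""
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    control_scores = [s for s, y in zip(scores, labels) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("both outcome classes must be present")

    # Count case/control pairs in which the case is ranked higher; ties count half.
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))
```

A model that ranks every case above every non-case gives 1.0, and a model that assigns everyone the same risk gives 0.5, matching the two reference points described in the text.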
A risk score was calculated for each individual using the logit equation:
$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \alpha + \sum_{i} \beta_i X_i$$
where p is the probability of the outcome (early or late age at onset), α is the constant and βi is the natural logarithm of the odds ratio for a specific predictor Xi.
The logit operator maintains the linearity of the model and allows the calculation of a probability of the outcome (in this case early or late age at onset) given the different sets of predictors, according to p = exp(logit)/(1 + exp(logit)). Thus the higher the risk score, the greater the risk of the outcome.
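The conversion from odds ratios to a risk score and then to a probability can be sketched as follows. The function names are hypothetical, and any coefficient values passed in are placeholders chosen by the caller, not estimates from this study.

```python
import math


def risk_score(alpha, odds_ratios, predictors):
    """Linear predictor: alpha plus the sum of ln(odds ratio) times predictor value.

    odds_ratios and predictors are dicts keyed by predictor name.
    """
    return alpha + sum(
        math.log(odds_ratios[name]) * value for name, value in predictors.items()
    )


def outcome_probability(logit):
    """Inverse logit, algebraically equal to exp(logit) / (1 + exp(logit))."""
    return 1.0 / (1.0 + math.exp(-logit))
```

With a hypothetical model such as `risk_score(-1.0, {"dr_dq_score": 1.3, "A*24:02": 1.5}, {"dr_dq_score": 4, "A*24:02": 1})`, the resulting score can be passed to `outcome_probability` to obtain the predicted probability of the outcome for that individual.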
The individuals were classified into different sub-groups according to their risk scores. Observed and predicted frequencies of the disease in subgroups were calculated. The Hosmer–Lemeshow χ2 statistic for goodness-of-fit was used for calibration to compare observed and predicted risk. Non-significant p-values for this test indicate good calibration. Both discrimination and calibration of risk models were carried out using the PredictABEL package for R (http://cran.r-project.org/web/packages/PredictABEL/index.html).
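To make the calibration step concrete, the following sketch computes a Hosmer–Lemeshow statistic by sorting individuals on predicted probability, splitting them into equally sized groups, and comparing observed with expected event counts in each group. It is an assumption-laden illustration rather than the PredictABEL implementation: the function name and the default of ten groups are choices made here, and the p-value (from a χ2 distribution, conventionally on groups − 2 degrees of freedom) is left to a statistics library.

```python
def hosmer_lemeshow_statistic(probabilities, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square statistic for predicted risks and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")

    # Rank individuals by predicted risk only (stable sort keeps input order on ties).
    ranked = sorted(zip(probabilities, outcomes), key=lambda pair: pair[0])
    n = len(ranked)

    statistic = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # predicted number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        mean_p = expected / size
        # Groups with mean predicted risk of exactly 0 or 1 contribute no variance term.
        if 0.0 < mean_p < 1.0:
            statistic += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return statistic
```

A small statistic relative to its χ2 reference distribution corresponds to the non-significant p-value that the text describes as indicating good calibration.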
The mean and standard deviation of age at onset by gender and proband status for each of the cohorts are summarized in Table 1. Genotyping and age at onset information for a total of 3602 type 1 diabetes patients corresponding to 1801 affected sib pairs from the T1DGC, including the extant collection, were included. The overall range for age at onset was 0–37 years.
Age at onset was found to be significantly higher in males than in females (p<0.005). In the T1DGC collection, where the proband was defined as the first child to develop type 1 diabetes, a strong difference in age at onset is seen between probands and non-probands (Table 1). In the T1DGC collection and some of the pre-existing collections, proband status appears to be a confounding variable for younger onset.
A stepwise linear regression was carried out, which included DPB1 alleles 02:02, 03:01 and 04:02, class I alleles A*24:02, A*02:01, A*11:01, A*32:01, A*66:01, B*18:01, B*35:02, B*57:01, C*03:03, B*39:06, B*44:03, and C*07:02, adjusting for gender, cohort of origin, and proband status. This analysis revealed that, for HLA, only DR-DQ genotype, A*24:02, and B*39:06 contributed significantly to age of type 1 diabetes onset, with B*44:03 and B*18:01 nearing statistical significance (Table 2).
The allele/genotype frequencies for DR3/4, A*24:02, B*18:01, B*39:06, and B*44:03 are shown in Table 3. The frequencies are stratified by early age at onset (age ≤5) versus not, by late age at onset (age ≥15) versus not, and by early onset (age ≤5) versus late onset (age ≥15). Differences in allele and genotype frequencies are seen among the various populations, as is expected for the genes in the HLA region. The effect of these alleles and genotypes on early and late onset was assessed by multiple logistic regression including all genetic variables in the model in addition to gender and proband status. We observe a striking difference in the frequency of B*18:01 among Sardinian patients compared to the other groups (Table 3). A much higher frequency of certain DR3 haplotypes in this population, compared to other European populations, has already been reported.
Using random effects meta-analysis we investigated whether there was evidence of statistically significant heterogeneity between study cohorts, i.e. whether, regardless of the frequency, the effect on age of onset was different for each of the five genetic variables studied. We found no evidence of inter-study heterogeneity for any of the early onset genetic effects nor for any of the early versus late onset associations, with the smallest p-value for heterogeneity being p=0.20. For both of these traits (early versus other and early versus late), meta-analyses of the genetic effects of B*18:01 and B*44:03 did not reach statistical significance; all other associations were statistically significant overall. For late age of onset we observed evidence of inter-study heterogeneity for the effect of B*39:06 (τ=0.086, p=0.05). By meta-analysis the only genetic effects that were significantly associated with late age of onset were the DR-DQ genotype and A*24:02, indicating that these are the most consistent effects throughout the age of onset distribution. In the absence of significant heterogeneity within the T1DGC sub-cohorts we have merged them for the risk prediction analysis.
We then assessed whether these HLA markers could predict early or late age at onset.
Logistic regressions on the early onset and late onset outcomes were fitted using three different models for each outcome: (1) gender as the only risk factor; (2) HLA as the only risk factor; (3) gender and HLA as risk factors. Early versus late models were not fitted given the small sample sizes involved.
The risk discrimination and calibration results from these models in all cohorts are shown in Table 4.
The best prediction for early onset was seen in the JOS cohort for an HLA-only model, yielding an area under the curve (AUC) of 0.700 (Table 4). For late onset, the best AUC was seen in the DAN cohort for a model including both gender and HLA (AUC=0.644). For all other cohorts and outcomes except one, at least one of the models had an AUC value significantly higher than 0.5, the value at which the test would have no predictive value. None of the risk prediction models had an AUC significantly different from 0.5 in the Sardinian cohort.
For late onset, the models that included gender and HLA had calibration problems in two of the cohorts because of the heterogeneous relationship between gender and age of onset among cohorts. We also note that the wide confidence intervals for AUC in some of the cohorts are likely due to the smaller sample sizes available in those studies.
In the current study we have investigated the role of HLA high resolution genotypes on age at T1D onset in various populations of European descent. To our knowledge this is the first study to compare the role of HLA on age of onset in different populations.
We further investigated whether such genotyping information could have any predictive value for assessing the risk of very young age at onset in contrast to late onset for type 1 diabetes. Using the largest data set to date to address this question we found that the strongest genetic contribution to age of onset appears to come from the DRB1-DQB1 genotypes, which also have the strongest influence on disease risk. In addition a few select class I alleles, notably A*24:02, B*18:01, B*39:06, and B*44:03 also influence age at onset. Of these the most consistent effect is that of A*24:02, whereas the other class I alleles either do not influence specific cut-offs of age of onset (early versus late) or show evidence of strong heterogeneity across populations (e.g. B*39:06 for late age of onset).
In both the T1DGC collection and other independent extant collections, we find that genotypes for classical class I and class II HLA can have some modest predictive power for these two outcomes. On the one hand, this confirms the role of HLA polymorphism in influencing age at onset. On the other hand, it highlights that other risk factors not included in our models must also be influencing age at type 1 diabetes onset.
Current approaches for the prediction of type 1 diabetes in screening studies take advantage of the major genetic risk factors, genotyping for HLA-DR and HLA-DQ loci and screening for autoantibodies directed against islet-cell antigens. For example, children who carry both of the highest-risk HLA haplotypes (DR3/DR4–DQB1*03:02) have a risk of approximately 1 in 20 for a diagnosis of T1D by the age of 15 years. The results presented here may help improve such models by also taking into account the role of genetic risk factors on age at onset.
We found that gender had little or no predictive value; because its relationship with age of onset was not consistent across cohorts, in most instances it did not improve the AUC. For the DAN and HBDI cohorts, where the difference in age of onset between genders was strongest, the inclusion of gender did show a slight improvement but not in an additive way. This is consistent with what has been reported for the combination of genetic and non-genetic factors for other disease areas.
We note several study limitations. Our analyses have used data derived from affected sib pair cohorts of European descent, selecting for patients with a strong genetic contribution to type 1 diabetes and therefore possibly also to its age at diagnosis. The current data are thus reflective of the predictive value of HLA in a group of patients enriched for genetic risk. On the other hand, these data are relevant to clinical research, as studies of first degree relatives (follow-up, prevention trials) involve those who have a family member already diagnosed with T1D [16, 18], and genetic factors combined with other factors could be applied to the analysis of data from cohorts of relatives. In addition, these results highlight the differences between European descent populations and illustrate the limits and the extent to which HLA may be helpful in predicting age at disease onset.
We have developed and calibrated three risk prediction models for age at early and late onset of type 1 diabetes, based on five independent patient collections. We hope that these models may be used as pilots to lead further research in defining risk prediction for age at onset using other risk factors (e.g., environmental exposures, autoantibodies). The models may be applied at the individual level to predict the most likely category of age at onset (early or late), but also at the population level, with reference to other relative risks from published studies, to estimate the potential population risk reduction that may be gained by primary prevention of any modifiable risk factors that influence type 1 diabetes and the ensuing complications.
This research utilizes resources provided by the Type 1 Diabetes Genetics Consortium, a collaborative clinical study sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institute of Allergy and Infectious Diseases (NIAID), National Human Genome Research Institute (NHGRI), National Institute of Child Health and Human Development (NICHD), and Juvenile Diabetes Research Foundation International (JDRF) and supported by U01 DK062418. This work was also supported by R01 DK61722 (J. A. N.).
Author Contributions: H.A.E. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published. J.C. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published. M.V. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published. P.V.M. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published. J.An. contributed to the study conception and data collection, revised the article for critically important content and gave final approval to the version to be published. A.M.V. contributed to the study conception, data collection and data analyses, wrote the manuscript and gave final approval to the version to be published.
Duality of interest statement: The authors declare that there is no duality of interest associated with this manuscript.