Multiple genetic loci have been convincingly associated with the risk of type 2 diabetes mellitus. We tested the hypothesis that knowledge of these loci allows better prediction of risk than knowledge of common phenotypic risk factors alone.
We genotyped single-nucleotide polymorphisms (SNPs) at 18 loci associated with diabetes in 2377 participants of the Framingham Offspring Study. We created a genotype score from the number of risk alleles and used logistic regression to generate C statistics indicating the extent to which the genotype score can discriminate the risk of diabetes when used alone and in addition to clinical risk factors.
There were 255 new cases of diabetes during 28 years of follow-up. The mean (±SD) genotype score was 17.7±2.7 among subjects in whom diabetes developed and 17.1±2.6 among those in whom diabetes did not develop (P<0.001). The sex-adjusted odds ratio for diabetes was 1.12 per risk allele (95% confidence interval, 1.07 to 1.17). The C statistic was 0.534 without the genotype score and 0.581 with the score (P=0.01). In a model adjusted for sex and self-reported family history of diabetes, the C statistic was 0.595 without the genotype score and 0.615 with the score (P=0.11). In a model adjusted for age, sex, family history, body-mass index, fasting glucose level, systolic blood pressure, high-density lipoprotein cholesterol level, and triglyceride level, the C statistic was 0.900 without the genotype score and 0.901 with the score (P=0.49). The genotype score resulted in the appropriate risk reclassification of, at most, 4% of the subjects.
A genotype score based on 18 risk alleles predicted new cases of diabetes in the community but provided only a slightly better prediction of risk than knowledge of common risk factors alone.
Type 2 diabetes mellitus is a major health problem worldwide.1 Fortunately, its development can be prevented in many instances,2 and persons at risk can be readily identified with the measurement of a few common risk factors.3-5 Type 2 diabetes is heritable: the risk among people with familial diabetes is 2 to 6 times that among people without familial diabetes.6,7 Recent genetic association studies have provided convincing evidence that several novel loci are associated with the risk of diabetes,8-13 each with a 5 to 37% increase in the relative odds of diabetes per risk allele.13 These loci may account for the familial basis of diabetes, and their discovery may herald a new era in the prediction of the disease, whereby individual loci might be combined into a genetic risk score for enhanced detection of persons at risk for diabetes.14-17
In clinical practice, a few common risk factors that can be easily measured are powerful harbingers of type 2 diabetes.5 Despite the intuitive appeal of a genetic risk score, it remains an untested hypothesis that genetic information allows better prediction of the risk of diabetes than knowledge of common risk factors alone.18,19 We tested this hypothesis in the framework of three perspectives. First, we asked whether autosomal genetic information at birth, when only sex (the combination of X and Y chromosomes) is known, improves the ability to identify people who are at increased risk for the development of diabetes by middle age. Second, we asked whether genotype information would add to knowledge of family history, which is commonly considered to represent genetic risk. Finally, we asked whether adding genetic information to information on risk factors that are commonly measured at a clinical examination in adulthood improves the prediction of risk. We used this framework to test the ability of a panel of 18 single-nucleotide polymorphisms (SNPs) that are known to have associations with the risk of diabetes to predict new cases of type 2 diabetes in a large, prospectively examined, community-based cohort.
The Framingham Heart Study commenced in 1948 with the enrollment of 5209 people of European ancestry, 28 to 62 years of age, residing in Framingham, Massachusetts; the participants were subsequently examined every 2 years. The Framingham Offspring Study commenced in 1971 with the enrollment of 5124 offspring of the original cohort and the spouses of the offspring; participants were 5 to 70 years of age at the first examination.20 Participants were next examined 8 years later and then every 4 years thereafter through examination 7 (which took place in the period between 1999 and 2001). Of the 5124 participants in the Framingham Offspring Study, 2776 were genotyped for 18 SNPs. For 2377 of these participants, complete phenotypic and follow-up data over the course of one or more observational periods were available, as well as complete information on at least 15 of 18 genotyped SNPs. The study was approved by the institutional review board at Boston University, and written informed consent was obtained from all participants.
Each examination consisted of a medical history taking, physical examination, and collection of a fasting blood sample.21 In the sixth examination cycle (1995 through 1998), participants completed a self-administered questionnaire that asked about family history of disease. We defined a positive self-reported family history of diabetes as a report that one or both parents had diabetes; this definition is more than 56% sensitive and 97% specific for confirmed parental diabetes.22 Parental diabetes was confirmed by means of direct observation of the original cohort, over the course of 46 years of observation after their enrollment in the Framingham Heart Study, at the end of which time the mean age of surviving parents was 83 years. We considered diabetes to be present in a parent when medication was prescribed to control the diabetes or when the casual plasma glucose level was 11.1 mmol per liter (200.0 mg per deciliter) or higher at any examination. We defined diabetes to be present in an offspring when treatment was prescribed to control the diabetes or when the fasting plasma glucose level was 7.0 mmol per liter (126.0 mg per deciliter) or higher at any examination. More than 99% of the cases of diabetes among the participants in the Framingham Offspring Study are type 2 diabetes.6
We used two recent diabetes genomewide association studies10,13 to select 17 SNPs confirmed to be associated with type 2 diabetes in populations of European ancestry. We added one SNP, rs689, in the INS locus, that we had previously found to be associated with diabetes (P=0.02) in Framingham Offspring Study participants.23 Using these 18 SNPs, we constructed a genotype score ranging from 0 to 36 on the basis of the number of risk alleles. Genotyping was performed with the use of iPLEX (Sequenom).24 The minimum call rate was 96.9%, the average consensus rate from 254 duplicates was 99.5%, and all SNPs were in Hardy-Weinberg equilibrium (P>0.02).25
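As an illustration of the unweighted score described above, the following minimal Python sketch sums the number of risk alleles (0, 1, or 2 per SNP) across 18 loci, giving a value from 0 to 36. The function name and the treatment of missing genotypes (passed as None and skipped) are assumptions made for this example; the text does not state how incomplete genotypes were scored.

```python
from typing import Optional, Sequence

N_SNPS = 18  # number of loci in the score, as stated in the text


def genotype_score(risk_allele_counts: Sequence[Optional[int]]) -> int:
    """Sum risk-allele counts (0, 1, or 2 per SNP) across all loci.

    Missing genotypes are given as None and skipped. This handling is an
    assumption of the sketch, not the study's documented method.
    """
    if len(risk_allele_counts) != N_SNPS:
        raise ValueError(f"expected {N_SNPS} SNPs, got {len(risk_allele_counts)}")
    total = 0
    for count in risk_allele_counts:
        if count is None:
            continue  # genotype not called for this SNP
        if count not in (0, 1, 2):
            raise ValueError(f"invalid risk-allele count: {count}")
        total += count
    return total


# Hypothetical subject: heterozygous for the risk allele at every locus.
example_score = genotype_score([1] * N_SNPS)
```

A subject homozygous for the risk allele at all 18 loci would reach the maximum of 36, and a subject with no risk alleles would score 0.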
We used mixed-effects models to compare the mean genotype score for persons in whom diabetes developed with the score for those in whom diabetes did not develop. We estimated the cumulative incidence of diabetes by dividing the number of persons in whom diabetes developed as of the end of follow-up by the total number at risk. We used pooled logistic-regression models with generalized estimating equations to examine the association between the genotype score and the risk that diabetes would develop over the course of 28 years. The use of generalized estimating equations accounts for the presence of related persons in the sample,26 and the method of pooling person-examinations accounts for time-dependent risk factors, providing valid estimates of effect similar to those obtained with the use of time-dependent Cox analyses.27,28 We pooled three examination periods (examinations 1 and 2, 3 and 4, and 5 through 7) to test the 8-to-10-year risk of diabetes, as we did in our previous diabetes prediction model.5 For these analyses, subjects with diabetes at the first examination of each period were excluded, and new cases of diabetes were enumerated through the end of the last examination in the period. We constructed a series of genotype-score models that were adjusted for sex, for sex and self-reported family history of diabetes, and for risk factors identified in our previously published and validated “simple clinical model,” including sex, family history, age, body-mass index, fasting plasma glucose level, systolic blood pressure, high-density lipoprotein cholesterol level, and triglyceride level.5,29 We had previously estimated the precision of this model using bootstrap resampling.5 Since self-report is the method by which information on family history is almost always collected, we used self-reported family history of diabetes in the primary analysis.
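The pooling of person-examinations can be sketched as follows. This is a simplified illustration, not the study's analysis code: it assumes that every subject attended every period, and the record layout and function name are hypothetical. Each subject contributes one record per period until diabetes is observed, so subjects with diabetes at the baseline of a period are excluded from that period, as described above.

```python
from typing import Dict, List, NamedTuple, Optional


class PersonExam(NamedTuple):
    subject_id: str
    period: int        # 1, 2, or 3
    new_diabetes: int  # 1 if diabetes developed during the period, else 0


def pool_person_exams(onset_period: Dict[str, Optional[int]],
                      n_periods: int = 3) -> List[PersonExam]:
    """Build pooled person-examination records.

    onset_period maps a subject id to the period in which diabetes was first
    observed (None if diabetes never developed). Assumes full attendance.
    """
    records = []
    for subject_id, onset in onset_period.items():
        for period in range(1, n_periods + 1):
            if onset is not None and onset < period:
                break  # diabetes already present at this period's baseline
            records.append(PersonExam(subject_id, period, int(onset == period)))
    return records


# Hypothetical data: "a" never develops diabetes, "b" in period 2, "c" in period 1.
onsets = {"a": None, "b": 2, "c": 1}
pooled = pool_person_exams(onsets)

# Cumulative incidence as defined in the text: cases divided by persons at risk.
cases = sum(r.new_diabetes for r in pooled)
cumulative_incidence = cases / len(onsets)
```

The pooled records would then be passed to a logistic-regression routine; accounting for related persons with generalized estimating equations is outside the scope of this sketch.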
We calculated odds ratios and 95% confidence intervals associated with each additional risk allele for each SNP individually and in the genotype score. Using C statistics that were compared with a nonparametric approach, we evaluated the discriminatory capability of the models with the genotype score as compared with the models without the genotype score.30 We also evaluated risk reclassification with the use of the genotype score, according to the method developed by Pencina et al. for determining net reclassification improvement.31 We assessed model calibration using the Hosmer-Lemeshow chi-square test.32 We used categories of genotype score to calculate likelihood ratios and posterior probabilities of diabetes.33 Statistical analyses were performed with the use of SAS software, version 8 (SAS Institute). A two-tailed P value of less than 0.05 was considered to indicate statistical significance.
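For readers unfamiliar with the C statistic, the sketch below computes it directly from its definition: the proportion of all case and non-case pairs in which the case has the higher predicted risk, with ties counted as one half. The function name and the example values are illustrative assumptions. The sketch does not implement the nonparametric comparison of two C statistics used in the analysis, nor any adjustment for related persons.

```python
from typing import Sequence


def c_statistic(predicted: Sequence[float], outcome: Sequence[int]) -> float:
    """Proportion of case/non-case pairs ranked correctly (ties count 0.5)."""
    cases = [p for p, y in zip(predicted, outcome) if y == 1]
    noncases = [p for p, y in zip(predicted, outcome) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_risk in cases:
        for noncase_risk in noncases:
            if case_risk > noncase_risk:
                concordant += 1.0
            elif case_risk == noncase_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(noncases))


# Hypothetical predicted risks for two non-cases followed by two cases.
example_c = c_statistic([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

A value of 0.5 corresponds to discrimination no better than chance, and a value of 1.0 to perfect separation of cases from non-cases.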
Characteristics of the 2377 subjects at the baseline of each of the three cross-sectional periods are shown in Table 1. From the inception of the study in 1971 to the baseline of the third period in 1987, subjects gained weight and levels of risk factors became more adverse. Through the end of follow-up, 255 cases of diabetes accumulated over 6130 person-examinations. Characteristics of 18 risk loci are shown in Table 2. The risk alleles were common (with a risk-allele frequency of >10%), genotype frequencies were similar to those in other samples of subjects of European ancestry, and in this population, few individual odds ratios were significant. The mean (±SD) genotype score was 17.7±2.7 among subjects in whom diabetes developed and 17.1±2.6 among those in whom diabetes did not develop (P<0.001), and the cumulative incidence of diabetes increased significantly with increasing score (P<0.001) (Fig. 1, and Fig. 1 and 2 in the Supplementary Appendix, available with the full text of this article at www.nejm.org). Of 2377 subjects, 2.7% were homozygous for any risk allele, 16.2% were heterozygous for the risk allele, and 8.4% had no risk allele at half or more (≥9 of 18 SNPs) of the loci.
Three regression models for predicting diabetes are shown in Table 3, and in Figure 3 in the Supplementary Appendix. The C statistic for the sex-adjusted model was low (0.534) but improved significantly with the addition of the genotype score (0.581, P=0.01), and the relative risk for diabetes increased by 12% per risk allele. In a model adjusted for sex and self-reported family history of diabetes, the C statistic was modest without the genotype score (0.595) and did not improve significantly with the score. In a model adjusted for the risk factors included in the simple clinical model, the C statistic was excellent without the genotype score (0.900) and did not improve significantly with the score, and the genetic relative risk remained constant at 11% per risk allele.
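To show what a per-allele estimate implies, the sketch below assumes that the logistic model is linear in the genotype score on the log-odds scale, so that a difference of k risk alleles multiplies the odds by the per-allele odds ratio raised to the power k. The function name and the five-allele difference are illustrative assumptions; the result is an arithmetic consequence of that assumption and is not a reported finding.

```python
def odds_ratio_for_allele_difference(per_allele_or: float,
                                     allele_difference: int) -> float:
    """Odds ratio implied by a difference of `allele_difference` risk alleles,
    assuming the log-odds of diabetes are linear in the genotype score."""
    return per_allele_or ** allele_difference


# Sex-adjusted per-allele odds ratio of 1.12, hypothetical 5-allele difference.
implied_or = odds_ratio_for_allele_difference(1.12, 5)  # about 1.76
```

Under this assumption, two people whose scores differ by five risk alleles would differ in their odds of diabetes by a factor of roughly 1.8.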
Net reclassification by means of the genotype score over the course of 8 to 10 years of follow-up is shown in Table 4, and in Figure 4 in the Supplementary Appendix. In the model that was adjusted for sex alone, the genotype score appropriately reclassified 4.1% of the participants (P=0.004), primarily by reclassifying lower-risk persons into higher-risk categories. In the models adjusted for sex and family history or for simple clinical risk factors, the genotype score appropriately reclassified smaller proportions of participants (≤2.6%, P≥0.17).
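The net reclassification improvement can be sketched as the net proportion of cases moved to a higher risk category plus the net proportion of non-cases moved to a lower one. The code below is a minimal illustration of that definition; the function name, the integer category codes, and the example data are assumptions, and the risk-category cutoffs used in the study are not given in this section.

```python
from typing import Sequence


def net_reclassification_improvement(old_category: Sequence[int],
                                     new_category: Sequence[int],
                                     outcome: Sequence[int]) -> float:
    """NRI = [P(up | case) - P(down | case)]
           + [P(down | non-case) - P(up | non-case)].

    Categories are ordered integers; a higher value means higher risk.
    """
    up_case = down_case = n_case = 0
    up_non = down_non = n_non = 0
    for old, new, y in zip(old_category, new_category, outcome):
        if y == 1:
            n_case += 1
            up_case += int(new > old)
            down_case += int(new < old)
        else:
            n_non += 1
            up_non += int(new > old)
            down_non += int(new < old)
    if n_case == 0 or n_non == 0:
        raise ValueError("need at least one case and one non-case")
    return (up_case - down_case) / n_case + (down_non - up_non) / n_non


# Hypothetical data: three cases followed by three non-cases.
old = [0, 0, 1, 1, 0, 0]
new = [1, 0, 1, 0, 0, 1]
diabetes = [1, 1, 1, 0, 0, 0]
example_nri = net_reclassification_improvement(old, new, diabetes)
```

In this example one of three cases moves up, and among non-cases one moves down and one moves up, so the net improvement is one third.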
Adjustment for age diminished the genotype score’s capacity to improve discrimination, with a C statistic in the sex-adjusted model of 0.729 without the score and 0.741 with the score (P=0.05) (Table 1 in the Supplementary Appendix). However, when the sample was stratified by age (<50 years vs. ≥50 years), discrimination with the addition of the genotype score appeared to be substantially better among younger subjects (C statistic, 0.532 to 0.609; P=0.009; net reclassification improvement, 11.9%; P=0.009) than among older subjects (C statistic, 0.530 to 0.558; P=0.20; net reclassification improvement, 0.47%; P=0.92). The interaction of age with genotype score was not significant (P=0.37).
C statistics for models classifying the genotype score into three groups were virtually identical to those for models using the genotype score per risk allele. In the model adjusted for sex, the odds ratio for the development of diabetes in the middle-score group (genotype score, 16 to 20) as compared with the low-score group (genotype score, ≤15) was 1.62 (95% confidence interval [CI], 1.14 to 2.29), and the odds ratio for the development of diabetes in the high-score group (genotype score, ≥21) as compared with the low-score group was 2.60 (95% CI, 1.68 to 4.02) (Fig. 1, and Table 2 in the Supplementary Appendix). Positive likelihood ratios for estimating the posterior probability of diabetes were 0.62, 1.04, and 1.73 in the low-score, middle-score, and high-score groups, respectively (Table 3 in the Supplementary Appendix). Models that used a weighted genotype score had discriminatory properties that were similar to those in models that used the unweighted score (Tables 4, 5, and 6 in the Supplementary Appendix). Sex-adjusted odds ratios for diabetes associated with directly observed parental diabetes as compared with no parental diabetes were 1.91 (95% CI, 1.44 to 2.55) without the genotype score and 1.82 (95% CI, 1.37 to 2.43) with the genotype score, and C statistics were 0.576 without the genotype score and 0.604 with the score (P=0.048). Models that used risk factors in clinical categories had results that were similar to those of models that used continuous risk factors (Tables 7 and 8 in the Supplementary Appendix).
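The conversion of a likelihood ratio into a posterior probability follows from Bayes' theorem on the odds scale: posterior odds equal prior odds multiplied by the likelihood ratio. The sketch below applies this to the three likelihood ratios reported above. The pretest probability used here, 255 cases among 2377 subjects, is an illustrative assumption for the example; the posterior probabilities actually reported by the study are in the Supplementary Appendix and are not reproduced here.

```python
def posterior_probability(pretest_probability: float,
                          likelihood_ratio: float) -> float:
    """Convert a pretest probability to a posterior probability using
    posterior odds = prior odds x likelihood ratio."""
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    prior_odds = pretest_probability / (1.0 - pretest_probability)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Illustrative pretest probability (an assumption): overall cumulative incidence.
pretest = 255 / 2377

# Likelihood ratios for the low-, middle-, and high-score groups from the text.
posteriors = {
    group: posterior_probability(pretest, lr)
    for group, lr in (("low", 0.62), ("middle", 1.04), ("high", 1.73))
}
```

With a likelihood ratio close to 1, as in the middle-score group, the posterior probability stays close to the pretest probability, which is why that group carries little additional information.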
In a community-based sample followed for 28 years, we found that a genotype score for type 2 diabetes, based on 18 loci, was associated with a very modest but significant 12% increase in the relative risk of diabetes per risk allele. Adjustment for family history and common risk factors did not diminish the size or significance of this association. Irrespective of clinical variables, people with the highest genotype scores as compared with those with the lowest scores had a risk that was increased by a factor of 2.6. Although the individual risk alleles were common, only a small percentage of people had at least one risk allele at half or more of the loci. It might be expected that a score based on common variants would not be an efficient discriminator of risk, owing to weak effects for individual alleles.34 However, we found that a combination of risk alleles was a strong risk factor with modest discriminatory ability when sex alone was considered, especially among younger persons. This finding might be useful for genetic screening at birth or in youth, before obvious risk factors have developed.
When familial diabetes or clinical risk factors that are typically documented at a periodic examination in adulthood were considered, the genotype score did not improve risk discrimination. A possible explanation for this finding is that some alleles might increase the risk through these intermediate traits or that phenotypic risk factors are overwhelmingly stronger determinants of the near-term risk of diabetes than are known genetic influences. Findings were similar with use of a score in which loci were weighted according to previous evidence of their association with diabetes. The results suggest that “personalized medicine” that is made possible by the expanded understanding of genetics is not yet as useful for the prediction of the risk of diabetes in adults as it is for other potential applications such as pharmacogenetic analyses of drug toxicity or response.
A few other studies have examined the use of combinations of SNPs to predict the risk of diabetes. In the Botnia study, people with risk alleles in both the gene encoding for the peroxisome proliferator-activated receptor gamma (PPARG) and the gene encoding for the cysteine protease calpain 10 (CAPN10), as compared with people who had no risk alleles, had a risk that was increased by a factor of 2.6,14 but the use of risk alleles as predictors did not result in a better C statistic for diabetes (0.68) than did the use of fasting glucose level and body-mass index.18 In a study from the United Kingdom, subjects with all six risk alleles in the gene encoding for potassium inwardly-rectifying channel, subfamily J, member 11 (KCNJ11), PPARG, and the transcription factor 7-like gene (TCF7L2) (1% of the sample), in comparison with subjects who had no risk alleles, had a risk that was increased by a factor of 5 to 7; the C statistic with the three loci as predictors was 0.58.15 In the Data from an Epidemiological Study on the Insulin Resistance Syndrome (DESIR) study, carriers of at least 4 risk alleles in the genes encoding for glucokinase (GCK), interleukin 6 (IL6), and TCF7L2, as compared with those who had one risk allele or none, had a risk that was increased by a factor of 2.5.17 The C statistic with these three variants as predictors was 0.56, and the C statistic with these loci plus age, sex, and body-mass index as predictors was 0.82. The Genetics of Diabetes Audit and Research Tayside (GoDARTS) study examined 18 risk loci.35 Carriers of more than 24 risk alleles (1.2% of the sample), as compared with carriers of 10 to 12 risk alleles, had a prevalence ratio of 4.2. The C statistic with all variants combined as predictors was 0.60; the C statistic with age, body-mass index, and sex as predictors was 0.78; and the C statistic with variants and risk factors as predictors was 0.80.
Our data extend these studies to show that individual per-allele effects are small; that people with more risk alleles are at greater risk than those with fewer, no matter how many or which genes are considered; that groups with an apparently greatly increased genetic risk can be identified but are not commonly found; and that the marginal ability of genotype scores to discriminate risk is small, with minimal effect after consideration of even a few common risk factors.
We found that the presence or absence of parental diabetes and the genotype score were independently associated with the risk of diabetes. This suggests that family history as a risk factor for diabetes conveys more than heritable genetic information; it probably includes nongenetic familial behaviors and norms. The lower relative risks for diabetes associated with observed parental diabetes as compared with those associated with self-reported family history (approximately 1.8 vs. approximately 2.2) support the contention that family history contains more risk information than is implied by inheritance of the diabetes phenotype alone.
One of the limitations of our study is that the 18 SNPs we included are probably insufficient to account for the familial risk of diabetes. They account for a minority of diabetes heritability, and the SNP array platforms from which they were chosen capture only approximately 80% of common variants in Europeans. In addition, we have not considered structural variants that might confer a risk of diabetes. It is possible that the addition of rare risk alleles with large effects, or a much larger number of common risk alleles with small individual effects, could improve discrimination.36 Indeed, as many as 500 loci may underlie the genetic risk of type 2 diabetes.16 Also, we did not study interactions among genes or between genes and the environment that might alter the genetic risk in exposed persons. As more diabetes risk variants become known, their incorporation into the genotype score may explain more of the genetic risk implied by parental diabetes.
Our study has other limitations. There were few significant associations between individual risk alleles and diabetes in the Framingham Offspring Study cohort, but this finding was expected, given that alleles of small effect were tested in a community-based sample of modest size, and the aggregate set of 18 SNPs was predictive of new cases of diabetes. The participants in the Framingham Offspring Study are essentially all of European ancestry; allelic variation may require that different SNPs be used to generate a genotype score in different ancestry groups.37 Our genotype score gave all alleles the same weight; this may not be a true reflection of the biologic basis of type 2 diabetes. We considered the marginal value of the genotype score after accounting for only phenotypic risk factors, without consideration of behavioral risk factors for diabetes.38 We expect that accounting for unhealthful behaviors associated with the risk of diabetes would only further diminish the discriminatory capacity of a genotype score. However, persons with relatively less healthful lifestyle behaviors might be more susceptible to genetic risk than those with more healthful behaviors.39 Whether the genotype score would have value in predicting the risk of diabetes in specific subgroups that have an elevated risk on the basis of poor health habits remains to be tested.
In summary, a genotype score based on 18 risk alleles predicted new cases of diabetes in the community but did not result in a substantially better prediction of risk than the knowledge of common phenotypic risk factors alone. Although genetic information appeared to be useful when only factors known in youth were considered, genetic information in the context of risk factors measured in adulthood did not help to refine the prediction of diabetes risk. Our findings underscore the view that identification of adverse phenotypic characteristics remains the cornerstone of approaches to predicting the risk of type 2 diabetes.19
Supported by a contract from the National Heart, Lung, and Blood Institute’s Framingham Heart Study (N01-HC-25195), grants from the National Institute for Diabetes and Digestive and Kidney Diseases (NIDDK) (R01 DK078616 and K24 DK080140, to Dr. Meigs), an NIDDK Research Career Award (K23 DK65978, to Dr. Florez), and the Boston University Linux Cluster for Genetic Analysis, funded by a grant from the National Institutes of Health National Center for Research Resources Shared Instrumentation (1S10RR163736-01A1).
Dr. Meigs reports serving on a consultancy board for Interleukin Genetics and receiving grants from Sanofi-Aventis and GlaxoSmithKline; and Dr. Florez, receiving consulting fees from Merck and Publicis Healthcare Communications Group, a global advertising agency engaged by Amylin Pharmaceuticals. No other potential conflict of interest relevant to this article was reported.