In a British cohort (the Whitehall II study), a panel of 20 genotypes associated with type 2 diabetes performed less well than the Cambridge and Framingham offspring type 2 diabetes risk scores in discriminating incident cases of type 2 diabetes. Adding the genetic panel to the phenotype based risk models did not improve discrimination and produced only minimal improvement in accuracy of risk estimation assessed by recalibration and, at best, a minor net reclassification improvement.
Over the past five years, the pace of identification of genetic loci underlying susceptibility to common diseases has increased rapidly, leading to interest in how this information might best be used to improve personal and public health. One potential application is the use of genetic information to help predict susceptibility to disease in initially healthy people, so as to focus preventive interventions on those at the highest risk of future disease. This targeted approach to prevention is exemplified by the established use of risk equations based on non-genetic variables to estimate risk of coronary heart disease and guide blood pressure lowering and cholesterol lowering treatment.
3 4 This approach to the prevention of vascular disease, for which diabetes is a major risk factor, will become more systematic in the next two years, through the Department of Health’s vascular health check scheme (www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_083822).
Preventive interventions also exist for type 2 diabetes, which motivated the recent evaluation of risk scores (including those studied here) for the prediction of type 2 diabetes. The Cambridge risk score and the Framingham offspring risk score are based on a combination of demographic, family history, anthropometric, and biochemical data, but neither includes genetic information.
7 8 Although these phenotype based risk models seem to perform well, an important question is whether typing a panel of validated genetic risk factors might improve their ability to predict type 2 diabetes. Some studies in this area have used case-control datasets.
37 38 39 Although efficient for gene discovery, these are a suboptimal design for evaluating the predictive performance of a marker, as risk information is available only in relative terms and the range of metrics that can be derived to assess predictive performance is more limited than for prospective studies with incident cases of disease. Those prospective studies that have previously evaluated the performance of genetic markers have been set outside the UK, typed fewer type 2 diabetes risk alleles, or reported only some of a range of metrics available to evaluate the performance of a predictive test (table 6).
35 40 41 42 In a prospective study set in the UK, we therefore tested the performance of a panel of 20 common genetic variants associated with type 2 diabetes, each of which confers a small to moderate increase in the risk of type 2 diabetes, and compared prediction based on genetic information alone, phenotypic information alone, and both, by using a range of metrics to assess predictive performance.
Table 6: Comparison of published studies that have used genetic information with or without non-genetic risk factors to discriminate between people with and without type 2 diabetes, in case-control, cross sectional, or prospective settings
We found that risk functions based on routinely measured clinical variables better discriminated incident type 2 diabetes cases than did a panel of 20 diabetes associated single nucleotide polymorphisms. The inclusion of genetic information in the risk models did not improve the discrimination of cases of type 2 diabetes, nor did it provide clinically important improvement in the accuracy of these models when assessed by calibration. The addition of genetic data to phenotype based risk models also provided minimal net reclassification improvement. The addition of genetic information resulted in the reassortment of people into different risk categories, but not all the shifts were helpful. Although some eventual cases were upgraded to higher risk categories, almost as many had their risk downgraded, and the opposite was true for many of those who remained healthy.
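The reclassification argument above can be made concrete with a minimal sketch of the categorical net reclassification improvement (NRI): the net proportion of eventual cases moved to a higher risk category plus the net proportion of non-cases moved to a lower one. The function name, the 0 to 2 category coding, and the eight toy participants are illustrative assumptions, not data or code from the Whitehall II analysis.

```python
def net_reclassification_improvement(old_cat, new_cat, is_case):
    """Categorical NRI for ordered risk categories (higher number = higher risk).

    old_cat, new_cat: risk category per person under the old and new model.
    is_case: True if the person later developed the disease.
    """
    up_case = down_case = up_non = down_non = 0
    n_case = n_non = 0
    for old, new, case in zip(old_cat, new_cat, is_case):
        if case:
            n_case += 1
            up_case += new > old
            down_case += new < old
        else:
            n_non += 1
            up_non += new > old
            down_non += new < old
    # Upward moves are helpful for cases; downward moves are helpful for non-cases.
    nri_cases = (up_case - down_case) / n_case
    nri_noncases = (down_non - up_non) / n_non
    return nri_cases + nri_noncases


# Hypothetical toy data: 4 cases and 4 non-cases, categories 0 (low) to 2 (high).
is_case = [True, True, True, True, False, False, False, False]
old_cat = [1, 1, 2, 0, 0, 1, 1, 2]
new_cat = [2, 0, 2, 0, 0, 0, 2, 2]

nri = net_reclassification_improvement(old_cat, new_cat, is_case)
print(f"NRI = {nri:.2f}")
```

In this toy example half of the participants change category, yet the helpful and unhelpful shifts cancel within both cases and non-cases, so the net improvement is zero. This is the pattern described above in exaggerated form.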
Our findings are consistent with the nine previous published reports of 10 study populations (table 6), 35 37 38 40 41 42 43 44 even though the number and range of genotypes and the phenotype based risk models used for prediction varied across studies (see forest plots at www.ucl.ac.uk/genetic-epidemiology/WebMaterial). All models included age, body mass index, and sex, and much of the predictive information in any phenotype based model is likely to be encompassed in these terms.
The relations shown in figure 2 and the web figure illustrate one reason for the poor predictive performance of a panel of single nucleotide polymorphisms associated with common diseases. Although people carrying multiple risk alleles are at more extreme risk of type 2 diabetes than those carrying fewer copies, they represent only a small proportion of the population, because the inheritance of each risk allele is an independent event: the probability of inheriting multiple risk alleles is a function of the frequency of each allele in the population. For example, the probability of inheriting 10 independent risk alleles with frequencies around 0.3 is 0.3¹⁰ (about 6×10⁻⁶). People with an intermediate number of risk alleles would therefore be expected to account for the major portion of cases of type 2 diabetes, because of the large number of people at intermediate risk in the population. This explains the substantial overlap of the distribution of risk alleles among people who developed diabetes and those who remained disease-free, which makes it difficult to set a cut-off point of a gene count (or genetic risk function) that reliably discriminates later cases of type 2 diabetes. Although genetic tests for type 2 diabetes, based on a subset of the alleles studied here, can already be purchased in the commercial sector, our findings suggest that much more rigorous evaluation of their use as a health technology is needed before such tests should be adopted by healthcare organisations.
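The arithmetic in the preceding paragraph can be checked with a short sketch. The first function reproduces the 10 allele example. The second is a simplified illustration of why most people carry an intermediate number of risk alleles; it assumes, purely for illustration, that all 20 loci share a risk allele frequency of 0.3 and that the 40 alleles a person carries are independent. The real panel has varying frequencies, and the function names are hypothetical.

```python
from math import comb


def prob_all_risk(freq, n_alleles):
    """Probability that n independently inherited alleles are all risk alleles."""
    return freq ** n_alleles


def risk_allele_count_pmf(freq, n_loci):
    """Binomial distribution of the risk allele count across 2 * n_loci alleles.

    Assumes (for illustration only) equal frequency at every locus and
    independence between alleles.
    """
    n = 2 * n_loci
    return [comb(n, k) * freq**k * (1 - freq) ** (n - k) for k in range(n + 1)]


p10 = prob_all_risk(0.3, 10)
print(f"P(10 of 10 risk alleles) = {p10:.1e}")

pmf = risk_allele_count_pmf(0.3, 20)
mode = max(range(len(pmf)), key=pmf.__getitem__)  # most common allele count
middle = sum(pmf[8:17])      # share of people carrying 8 to 16 risk alleles
upper_tail = sum(pmf[20:])   # share of people carrying 20 or more
print(f"most common count: {mode}")
print(f"share with 8-16 risk alleles: {middle:.3f}")
print(f"share with 20 or more: {upper_tail:.4f}")
```

Under these assumptions the bulk of the population sits near the middle of the distribution and only a small fraction reaches the upper tail, which is why intermediate counts would be expected to contribute most cases.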
As a technology, however, genotype based tests have several inherent advantages over non-genetic tests. Genotype based assays are cheap, have high fidelity, and can be multiplexed, in contrast to multiple phenotypic risk factors and biomarkers, many of which require different methods for their measurement, and which are more affected by biological variability and measurement error than is genotyping. Moreover, because genotype is invariant it offers the prospect of risk assessment from much earlier in life than is possible with phenotype based tests. In the case of cardiovascular risk factors, evidence shows that greater benefits accrue from earlier intervention among people at higher risk (for example, in the form of smoking cessation or cholesterol lowering).
45 46 The findings of our study should thus not lead to the premature dismissal of genotype based risk prediction as a health technology. Rather, increased efforts should be made to understand the strengths and limitations of such tests as well as their optimal place in health care, a conclusion highlighted in the recent House of Lords Science and Technology Committee’s report on genomic medicine (www.publications.parliament.uk/pa/ld200809/ldselect/ldsctech/107/107i.pdf).
Limitations of study
Some limitations of our study should be noted. Although prospective, the Whitehall II study is workplace based and therefore not necessarily representative of the general population. However, the excellent performance in Whitehall II of the non-genetic risk functions for type 2 diabetes, both of which were developed and validated in general populations, suggests that this is unlikely to bias our conclusions substantially. Moreover, our findings are consistent with those of prospective studies set in representative general populations. Our findings are also not generalisable to people of non-European ancestry, whom we excluded from this analysis. Although DNA was collected some time after baseline, which could have introduced a survivor bias, we think that this is unlikely to have affected our results given the modest effect of the alleles we studied on risk of diabetes and the long natural history of the development of the life threatening complications of diabetes.
The two risk tools studied, based on non-genetic markers, performed better than genotype based tests despite the fact that the models, which were developed in different datasets, were not specifically recalibrated for the Whitehall II population. The common diabetes associated single nucleotide polymorphisms we studied might have greater incremental value in the prediction of type 2 diabetes when evaluated against some of the other validated risk models. However, we chose the Framingham and Cambridge risk scores because they are contemporary (which could be important, given the recent increase in the incidence and prevalence of type 2 diabetes), were developed in populations with a similar profile to the Whitehall II participants, and were based on studies set in the United States and the UK, where many of the genetic studies were done. Moreover, both include variables that are routinely measured in clinical practice. We did not evaluate QDScore, which is based on routinely collected primary care data (including deprivation scores, ethnicity, and current drug treatment for hypertension or cardiovascular disease and corticosteroid use) and which was reported during the preparation of this manuscript.
47 Because part of the information included in the family history component of a risk score will reflect common genotypes, this may have undermined the incremental value of genetic information for risk prediction. However, the variants we studied explain only a small proportion of the familial aggregation of diabetes.
36 Whether genotypes have greater predictive utility in particular categories of patient (such as among leaner people or those of a particular ancestry) could be assessed by pooling participant level data from a large number of prospective studies with the relevant information to ensure adequate power. Our current analysis is limited to the 20 common risk alleles for type 2 diabetes identified by large association or genome-wide studies. However, sequence variants of intermediate frequency but larger effect size are likely to be uncovered by future research, so our interpretation of the predictive utility of genotype should be regarded as interim. Moreover, as the actual causal variants at each gene/region remain for the most part uncertain, the predictive utility of genetic markers may also have been underestimated.
Our conclusions about the performance of genetic testing for type 2 diabetes are confined to the use of single common alleles at each locus. Other common risk alleles are likely to exist at the same genetic loci (including the causal variants), which could provide additional information relevant to prediction. Our conclusions are also not transferrable to other common diseases. For example, genetic variants underlying the susceptibility to age related macular degeneration have been identified, at least one of which is both common and large in its effect on risk.
48 We previously examined the predictive utility of a common single nucleotide polymorphism associated with the risk of coronary heart disease at the 9p21.3 chromosomal locus (rs10757274) when added to a risk function that included variables incorporated in the Framingham coronary heart disease risk equation.
49 Although this genotype added minimally to the ability of the Framingham risk score to discriminate future events, improving the area under the receiver operating characteristics curve by only 3%, it did significantly improve reclassification of risk of coronary heart disease, albeit modestly. Moreover, for some disorders, including age related macular degeneration, few if any non-genetic biomarkers or risk factors exist that can be used to estimate risk of future disease.
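For readers less familiar with the discrimination metric referred to throughout, the following minimal sketch computes the area under the receiver operating characteristic curve as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as half. The function name and the predicted risks are hypothetical and are not taken from any of the studies discussed.

```python
def auc(scores_cases, scores_noncases):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    Equals the probability that a random case outscores a random non-case,
    counting ties as half a win.
    """
    wins = 0.0
    for c in scores_cases:
        for n in scores_noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_noncases))


# Hypothetical predicted risks from some risk model.
cases = [0.30, 0.22, 0.15, 0.08]
noncases = [0.20, 0.10, 0.07, 0.05, 0.03]

area = auc(cases, noncases)
print(f"AUC = {area:.2f}")
```

Because the measure depends only on how cases rank relative to non-cases, a marker can shift some people across a treatment threshold, and so change reclassification, while leaving the area under the curve almost unchanged.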
Conclusions
Phenotype based risk models (the Framingham offspring and Cambridge risk scores) provided greater discrimination for type 2 diabetes than did models based on 20 common independently inherited alleles associated with risk of type 2 diabetes. The addition of 20 common genotypes associated with modest risk to phenotype based risk models produced only minimal improvement in the accuracy of risk estimation assessed by recalibration and at best a minor net reclassification improvement. The major translational application of the currently known common, small effect genetic variants influencing susceptibility to type 2 diabetes is likely to come from the insight they provide on causes of disease and potential therapeutic targets.
What is already known on this topic
- Several routinely used anthropometric and biological measures are included in the validated Cambridge and the Framingham offspring type 2 diabetes risk models
- Common single nucleotide polymorphisms associated with susceptibility to type 2 diabetes have been identified from whole genome and candidate gene association studies
- The extent to which a comprehensive panel of genotypes will help in predicting incident diabetes in the UK is not known
What this study adds
- A panel of 20 type 2 diabetes associated genotypes performed less well than the Cambridge and Framingham offspring type 2 diabetes risk scores in discriminating incident cases of type 2 diabetes
- Adding the genetic panel to the Cambridge or Framingham risk models did not improve discrimination and produced only minimal improvement in calibration and, at best, a minor net reclassification improvement
- The major translational application of currently known common type 2 diabetes associated genotypes is likely to arise from the insight they provide on causes of disease and therapeutic targets