Getting from the extremes to a comprehensive view of diabetes genetics.
As described above, success in the identification of genes impacting on individual risk of diabetes has come from two distinct approaches to gene discovery. The first, linkage mapping within monogenic and syndromic families, has delivered causal variants that are rare but highly penetrant. The second, large-scale association mapping, is now yielding growing numbers of common variants: these have, at best, modest effect sizes and low penetrance. Several genes are featured in the lists generated by both approaches. For example, mutations in KCNJ11, PPARG, WFS1, and TCF2 (HNF1B) are causal for syndromic and/or monogenic forms of diabetes, while common variants in these same genes influence predisposition to typical type 2 diabetes (55, 56, 64–66). While common variants in GCK (another gene causal for MODY) do not influence type 2 diabetes risk per se, they have a clear impact on fasting glucose levels within the population (88).
Of course, none of this should come as a surprise. Once a gene has been shown to harbor one variant associated with, or causal for, a diabetes-like phenotype, it becomes far more likely that other nearby variants (provided they exert some effect on the expression and/or function of the gene) will also have a detectable phenotypic effect. By the same token, the genotype-phenotype relationships revealed by these gene discovery efforts highlight the pathways involved as prime candidates for beneficial therapeutic or preventative manipulation, a view reinforced by the fact that at least two of the genes involved in both monogenic and multifactorial forms of diabetes (PPARG, KCNJ11) encode the targets of proven diabetes drugs.
However, it should be obvious that these two “flavors” of polymorphism (rare and highly penetrant on the one hand, common with low penetrance on the other) are not the only options when it comes to the variants that might influence disease susceptibility. It seems likely that between these extremes lies a class of medium-frequency, medium-penetrance variants that have until now escaped the gaze of the gene mappers.
Such variants would have penetrances too low to generate Mendelian patterns of segregation and frequencies too low to be covered by current GWA approaches. Despite this, such variants would have particularly attractive translational properties. For example, a variant where the risk allele has a frequency of 1% and produces a per-allele OR of ~3 would provide greater predictive power than the known variants in TCF7L2. Variants with such characteristics are increasingly being reported in other disease states (breast cancer and hyperlipidemia) (89, 90) and have even been reported in type 2 diabetes (91). In principle, just 30 such variants across the genome could explain the observed familial aggregation of type 2 diabetes in a way that the current set of common, low-penetrance variants cannot. Such a pool of variants would also provide an excellent tool for individual diabetes-risk prediction, generating a discriminative accuracy on receiver-operating characteristic analysis close to 80%. The advent of new high-throughput sequencing technologies, allied to large-scale association analysis, brings variants in this class within the range of genetic discovery and should allow researchers to evaluate the contribution to disease susceptibility attributable to variants that lie between the extremes where previous attention has been focused.
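The arithmetic behind the "30 variants" and "close to 80%" figures can be sketched roughly as follows. This is an illustrative calculation rather than the authors' own method, and it rests on several simplifying assumptions not stated in the text: the per-allele OR of 3 is treated as a relative risk, risks combine multiplicatively within and across loci, the 30 loci are independent and identical, controls carry the risk allele at the population frequency, and the predictive score is simply the count of risk alleles. The function names are hypothetical.

```python
from math import comb


def sibling_relative_risk(p, gamma):
    """Per-locus sibling relative risk under a multiplicative model.

    p     : risk-allele frequency
    gamma : per-allele relative risk (assumption: the OR is used as a proxy)
    """
    q = 1.0 - p
    w = p * q * (gamma - 1.0) ** 2 / (p * gamma + q) ** 2
    return (1.0 + w / 2.0) ** 2


def binomial_pmf(n, p):
    """Probability of k successes in n trials, for k = 0..n."""
    return [comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(n + 1)]


def allele_count_auc(n_loci, p, gamma):
    """AUC of a risk-allele count over n_loci identical, independent loci.

    Assumption: controls have the population allele frequency p, and cases
    have the frequency implied by the multiplicative model.
    """
    n_alleles = 2 * n_loci
    p_case = p * gamma / (p * gamma + (1.0 - p))
    controls = binomial_pmf(n_alleles, p)
    cases = binomial_pmf(n_alleles, p_case)

    auc = 0.0
    below = 0.0  # running P(control count < k)
    for k in range(n_alleles + 1):
        # A case with k alleles outranks every control with fewer alleles
        # and ties (counted as one half) with controls carrying exactly k.
        auc += cases[k] * (below + 0.5 * controls[k])
        below += controls[k]
    return auc


P_RISK, GAMMA, N_LOCI = 0.01, 3.0, 30  # values taken from the text above

per_locus = sibling_relative_risk(P_RISK, GAMMA)
combined = per_locus ** N_LOCI  # loci assumed to combine multiplicatively
auc = allele_count_auc(N_LOCI, P_RISK, GAMMA)

print(f"per-locus sibling relative risk: {per_locus:.3f}")
print(f"combined over {N_LOCI} loci:        {combined:.2f}")
print(f"AUC of risk-allele count:        {auc:.2f}")
```

Under these assumptions each locus contributes a sibling relative risk of roughly 1.04, and 30 such loci combine to roughly 3, which is the order of magnitude commonly quoted for type 2 diabetes. The corresponding AUC comes out a little under 0.8; the exact value is sensitive to the assumptions listed above.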
There are many other challenges to be faced and opportunities to be realized in the years ahead. The first of these lies in extending the range of variants that are accessible to scrutiny, beyond the low-frequency variants referred to in the previous paragraph, to a systematic evaluation of structural polymorphisms (insertions, deletions, and duplications) and variants that influence methylation status (92). Another lies in characterizing the association signals that have been found: large-scale resequencing and fine-mapping strategies will be required to recover the full allelic spectrum of causal variants and thereby obtain the most precise quantification of the genetic effects attributable to each locus. The part played by nonadditive interactions between different genetic loci and between susceptibility variants and environmental exposures needs to be charted, and discovery and replication studies need to be extended beyond the European populations that have been the focus of much of the current research.
Moving beyond genetics, there is work to be done to understand the novel (molecular, cellular, and physiological) biology revealed by these discoveries. If, as seems probable, many of the causal variants lie in noncoding regions, often some distance from the nearest coding sequence, they are likely to have subtle, spatially and/or temporally restricted effects. In such circumstances, gathering experimental evidence of their functional impact will be very difficult.
The final challenge lies in placing gene discovery into a translational context. The clinical utility and validity of genetic diagnostics are already established in monogenic diabetes, where such testing can influence clinical practice and treatment. However, diagnostic genetic testing is still underutilized by most diabetologists, and further research, development, and education are required. It is a major challenge to establish how to use knowledge from the identification of predisposing polymorphisms in type 2 diabetes to improve the care of the diabetic patient. Definition of the underlying polymorphisms and genes is but a first step on this road.