To highlight recent type 2 diabetes (T2D)-associated genetic discoveries and their potential for clinical application
The advent of genome-wide association screening has uncovered many loci newly associated with T2D. This review describes the techniques applied to discover novel T2D genes and compares their relative strengths, biases, and findings to date. The results of large-scale genome-wide association studies carried out since 2007 are summarized, and limitations of interpreting this preliminary data are offered. Recent studies exploring the clinical potential of these discoveries are reviewed, focusing on insights into T2D pathogenesis, risk prediction of future diabetes, and utility in guiding pharmacotherapy. The new T2D-associated loci have been implicated in β-cell development and function, highlighting insulin secretion in the disease process. Preliminary risk prediction studies show that more loci are needed to improve T2D risk indices. Studies have also revealed that genes may play a role in the pharmacologic response to anti-diabetic medications.
Since 2007 genome-wide association studies have rapidly increased the number of T2D-associated loci. This review summarizes the history of genetic association studies, the results from the new genome-wide association studies and the clinical application of these findings.
The clinical assessment of type 2 diabetes (T2D) has often incorporated genetic information in the form of family history. While extremely simple, family information has helped raise clinical vigilance for an individual patient’s risk of T2D due to the strong heritability of this disease. In contrast to a population risk of ~7%, the offspring of a diabetic parent has a 40% chance of developing T2D, and this risk rises to 70% if both parents are affected . As our technology to probe an individual’s genes has developed, our understanding of the genetic basis of T2D has become increasingly complex and more clinically relevant.
Prior to the complete sequencing of the human genome, efforts to define the genetic basis of T2D relied on linkage analysis and candidate gene approaches. Genetic linkage relies on the principle of identity by descent (i.e. shared DNA segments [loci] that arise from a common ancestor and segregate in families), combined with phenotypic information, to identify the loci that correlate with phenotype. While the unbiased nature of this method (i.e. no prior assumptions are made regarding which loci are related to the disease phenotype) makes it particularly appealing, pedigree information is required, which limits the number of individual genomes that can be included and thus the statistical power to detect modest genotype-phenotype correlation signals. Furthermore, linkage methods perform best when the risk mutations are relatively rare and have a high degree of penetrance; thus, they paved the way for discovery of monogenic forms of non-ketotic diabetes that present in adults under 25 years of age. This clinical syndrome, called maturity onset diabetes of the young (MODY), comprises 1-3% of diabetes and includes separate single-gene disorders (MODY 1-6 described to date) that are dominantly inherited. Linkage analysis was less fruitful, however, in the identification of generalizable T2D loci, with the possible notable exception of TCF7L2. TCF7L2 encodes a transcription factor implicated in the Wnt signaling pathway and the transcription of proglucagon genes; its precise role in T2D molecular pathogenesis remains under investigation.
The candidate gene approach is biased, in that it assumes that a specific locus is associated with disease prior to testing. Genetic variants of the gene in question are identified by focused sequencing and assessed by genotyping them in a large number of cases versus controls. If a variant is overrepresented within a given phenotype it is said to be associated with the trait of interest. This strategy allows for the detection of association signals of modest effect (odds ratios 1.05-1.5) when large samples are assembled.
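The case-control comparison described above can be sketched in a few lines. This is a minimal illustration rather than the analysis used in any cited study: the function name `allelic_odds_ratio`, the Wald confidence interval, and the allele counts are all hypothetical, chosen only to show what an association signal of modest effect looks like.

```python
import math


def allelic_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds ratio and Wald 95% CI from a 2x2 table of allele counts.

    All four counts must be positive; the counts used below are made up.
    """
    counts = (case_risk, case_other, control_risk, control_other)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR) for a 2x2 table (Woolf's method).
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts: 1,000 cases and 1,000 controls (2,000 alleles each).
or_, lo, hi = allelic_odds_ratio(1200, 800, 1100, 900)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

With these made-up counts the odds ratio falls in the modest range discussed above. The confidence interval narrows as the number of cases and controls grows, which is why large samples are needed to detect such signals.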
The P12A (proline to alanine change at position 12) variant of the peroxisome proliferator activated receptor γ2 (PPARG), the molecular target for thiazolidinedione medications, was the first genetic variant reproducibly associated with T2D and illustrates this approach. Individuals homozygous for the proline variant show increased insulin resistance and are at 20% greater risk for developing T2D. A similar examination of the islet ATP-sensitive potassium channel Kir6.2 yielded a reproducible association of the E23K polymorphism in the KCNJ11 gene with T2D and defects in insulin secretion [7, 8]. Variants in two other candidate genes implicated in monogenic diabetes (WFS1, which gives rise to Wolfram syndrome, and HNF1B, which causes a form of MODY) have been associated with common T2D via this approach [9-11]. Thus, despite significant investments, the candidate gene strategy has yielded few true T2D associations.
The failure of linkage and candidate gene approaches to account for the genetic basis of greater than 95% of T2D suggests that most of the genetic contribution to T2D arises from multiple loci with individually small effects. The sequencing of the human genome, the realization that human genetic variation can be cataloged by examining the ~1 in every 300 base pairs that vary among individuals (i.e. single nucleotide polymorphisms or SNPs), and advancements in genotyping technology and analyses empowered investigators to assess the genetic basis of T2D in a systematic, unbiased manner that could uncover loci of modest effects. The international HapMap project cataloged a significant proportion of the common SNPs (estimated around 10 million) in four ethnic populations, and by determining the correlative structure of these SNPs within each chromosome made it possible to assess the allelic identity of the great majority of them by examining only a fraction of “tag” SNPs that can be easily arrayed on a genotyping chip. This enabled researchers to perform case-control studies of large numbers of individuals with and without T2D and screen for risk loci throughout the entire genome, in a process called genome-wide association scanning (GWAS).
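The "correlative structure" that makes tag SNPs work is usually quantified as linkage disequilibrium. The sketch below computes the squared correlation r² between two biallelic SNPs from haplotype and allele frequencies. The function names, the example frequencies, and the 0.8 tagging threshold are illustrative assumptions and are not taken from the studies reviewed here.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def is_tagged(p_ab, p_a, p_b, threshold=0.8):
    """True if SNP 2 is captured by SNP 1 at the assumed r^2 threshold."""
    return ld_r_squared(p_ab, p_a, p_b) >= threshold


# Hypothetical frequencies: alleles that always travel together vs. independent alleles.
print(ld_r_squared(0.30, 0.30, 0.30))  # alleles always co-occur, r^2 close to 1
print(ld_r_squared(0.09, 0.30, 0.30))  # alleles independent, r^2 close to 0
```

When r² between a genotyped tag SNP and an ungenotyped SNP is high, the tag's genotype predicts the other SNP's genotype, so only the tag needs to be placed on the chip.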
The first GWAS for T2D was published in 2007 and performed in a French cohort where cases had been selected below a BMI threshold, replicating the known association of TCF7L2 with T2D risk and uncovering a missense SNP in SLC30A8 (encoding for a zinc transporter protein expressed in β cells) and an association with HHEX (a transcription factor involved in β cell development) . Soon thereafter, investigators at deCODE and their collaborators published a GWAS corroborating the HHEX and SLC30A8 associations while also identifying CDKAL1 (a putative islet glucotoxicity sensor) . Three other groups, the Wellcome Trust Case Control Consortium (WTCCC) , the Finland-United States Investigation of NIDDM Genetics (FUSION) group , and the Diabetes Genetics Initiative (DGI) , simultaneously published GWAS on the same day after prior sharing of data. These studies confirmed the known associations with TCF7L2, KCNJ11, PPARG, HHEX and SLC30A8, independently identified CDKAL1, and uncovered additional associations with IGF2BP2 (insulin-like growth factor binding protein) and CDKN2A/B (a cyclin dependent kinase tumor suppressor). Table 1 summarizes the characteristics and populations of the GWAS and Table 2 indicates the newly discovered associations, their statistical strength and their effect size. The DGI, FUSION, and WTCCC GWASs were combined in a formal meta-analysis by the DIAGRAM (Diabetes Genetics Replication And Metaanalysis) consortium, whose findings were replicated in an independent cohort of 50,000 samples which utilized, among others, the samples genotyped in the deCODE GWAS. Six new loci were identified: JAZF1, CDC123-CAMK1D, TSPAN8-LGR5, THADA, ADAMTS9, and NOTCH2-ADAM30 (see Table 2).
Following the publication of these initial GWAS and the meta-analysis that combined most of them, there has been intense speculation about the clinical utility and impact of their results. As such it is important to identify the limitations of this technique. While genome-wide scanning is a powerful way to rapidly and systematically uncover new associations, it does not circumvent the process of refining the associated loci to find the precise “causal” DNA sequences (causal in the sense that altering these sequences would eliminate the diabetic phenotype). Thus, loci identified by GWAS require in-depth sequencing and functional studies of the cellular and molecular effects of genes in that region. Many of the genes reported in Table 2 represent the nearest coding sequences to associated SNPs as determined by each GWAS, but the intervening sequences can be large. For example the CDKN2A and CDKN2B genes are ~150 kb away from the associated SNP. Conversely, SNP variants may exert their molecular effects at remote sites even when they are relatively close to other uninvolved genes. Fortunately, several of the reported association signals map to regions containing genes known to be involved in insulin secretion or expressed in the pancreas. Nevertheless, these “candidate genes” require the same laborious elucidation of mechanism and validation that the MODY genes and PPARG required.
Attempts to generalize the data from the current GWAS must also carefully account for the way these scans were constructed. The largest effect found (in TCF7L2) shows a per-allele odds ratio of 1.4, which effectively translates into the ~10% of Europeans who carry 2 copies of the risk allele having about twice the lifetime risk of T2D as the 40% who have no risk alleles . The DIAGRAM meta-analysis provided close to 100% statistical power to detect effect sizes corresponding to diabetes risk increased by 40% or greater, making it very unlikely that common loci of larger effects were missed (at least in populations of European ancestry). However, the existing catalog of human variation permitted the current GWAS to capture only common variants (i.e. SNPs that have a minor allele frequency of >5%). Thus, it can be said with certainty that common SNPs of large effects do not account for most of T2D heritability, but no statements can be made for rarer loci and their effect sizes. Regarding common variants, the loci identified by the current GWAS are estimated to explain only 5-10% of the genetic basis of T2D. Therefore, many additional loci, both rare and common, remain to be discovered.
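The step from a per-allele odds ratio of 1.4 to roughly double the risk in two-copy carriers follows from assuming that each risk allele multiplies the odds by the same factor. The sketch below makes that arithmetic explicit. The multiplicative model and the function name `genotype_odds_ratio` are assumptions made for illustration, and an odds ratio only approximates relative risk.

```python
def genotype_odds_ratio(per_allele_or, risk_allele_copies):
    """Odds ratio versus non-carriers under an assumed multiplicative per-allele model."""
    if risk_allele_copies not in (0, 1, 2):
        raise ValueError("a diploid genotype carries 0, 1, or 2 copies of an allele")
    return per_allele_or ** risk_allele_copies


# Per-allele odds ratio of 1.4, as reported above for TCF7L2.
for copies in (0, 1, 2):
    print(copies, round(genotype_odds_ratio(1.4, copies), 2))
```

Under this assumption two copies give 1.4 × 1.4 ≈ 1.96, which matches the "about twice" figure quoted above.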
Many of the genes near the newly identified T2D loci are related to β-cell function and insulin secretion. This has been viewed as presumptive evidence that insulin secretion plays a more important etiologic role in T2D than insulin resistance. The seemingly skewed nature of these findings is likely to be partially related to the selection process for cases and controls in the 2007 GWAS. The French and DGI groups selected relatively lean cases to control for the effect of obesity. Insulin resistance and obesity are highly correlated, and thus by deliberately minimizing the confounding influence of obesity those scans maximized the chances of identifying insulin secretion genes. As a case in point, the WTCCC group (which did not control for BMI in cases vs controls) identified a locus near FTO associated with T2D. When the BMI effect was statistically accounted for the association disappeared, indicating that the diabetes risk associated with the FTO locus is mediated by obesity . Insulin resistance genes may also have smaller effect sizes which the current GWAS were underpowered to detect, may be relatively rare and not tagged by the current set of SNPs, or their manifestation may be subject to stronger environmental influences .
An understanding of the genetic basis of diabetes has guided clinical practice and benefited patients with monogenic diabetes. However, these insights have not been easily translatable to most T2D. There is a great deal of optimism that the infusion of new loci from GWAS will broaden our clinical understanding of T2D and inform approaches to specific patients. Theoretically, association loci could impact three separate clinical areas: disease nosology, risk prediction, and pharmacogenetics.
Nosology, the classification of disease, is particularly germane to diabetes. While the clinical diagnosis is simply based on fasting or post-load glucose levels, it is clear that hyperglycemia per se represents a detectable common pathway in the syndrome but does not imply a single molecular pathogenesis. In this sense, diabetes is somewhat of an incompletely defined phenotype; genetic information has already been used to define hyperglycemic syndromes more precisely. For example, after HNF1B was identified as the single gene defect that resulted in MODY5, it was observed that MODY5 patients consistently had renal cysts resulting in the definition of a novel syndrome . Thus, the discovery of an underlying genotype for a subset of diabetic patients allowed an improvement in clinical phenotyping.
Attempts to use the latest GWAS loci to refine the diabetic phenotype and connect it to specific mechanistic pathways are underway. Initial efforts have focused on exploring the relationship of the new loci to quantitative glycemic traits, ranging from fasting glucose and insulin levels to more complex measures of insulin secretion and resistance [22-24]. These studies have corroborated that most of the GWAS loci are associated with deficiencies in insulin secretion responses to glucose challenge. Data for quantitative traits have also come from GWAS performed in non-diabetic participants for fasting plasma glucose (FPG) as a quantitative trait. The genes encoding the glucose-6-phosphatase catalytic subunit (G6PC2) and melatonin receptor 2 (MTNR1B) have been associated with FPG in these scans, but notably only the latter is associated with T2D.
Since genotypes are fixed at birth, the use of genetic information in risk prediction for future disease presents an obvious and attractive clinical application. Several groups have constructed risk scores using the newly discovered loci, with and without clinical risk prediction parameters, to determine whether genetic information allows for better prediction of diabetes. Using a cohort from the Framingham Offspring Study followed for 28 years, Meigs and colleagues  constructed a genotype score from 18 reported T2D SNPs. They compared three risk assessment models with and without genotype score: risk stratified by sex, family history, or clinical risk factors (age, sex, BMI, FPG, systolic blood pressure, HDL cholesterol and triglycerides). Using the C-statistic, a quantitative measure of a test’s ability to discriminate true from false positives (ranging from 0.5/no discrimination to 1.0/perfect discrimination), they showed statistically significant enhancements of risk prediction by addition of the genotype score only to the model that included sex information alone. Interestingly, they found that a self-reported parental history of diabetes (which they were able to corroborate from records) was an independent predictor of risk even when the genotype score was added to the model, suggesting that a family history likely captures more than purely genetic information, e.g. shared environment and behaviors. While analyzing their model for different sub-groups, the authors showed that when applied to subjects under 50 years of age the genotype score increased the C-statistic, and in a clinically more meaningful metric allowed for 12% of this group to be “correctly” reclassified into a high-risk group. 
This supports the notion that genetic factors may be useful in early detection of at-risk groups before clinical risk factors such as BMI or FPG manifest themselves, allowing physicians to target effective but long-term interventions, such as lifestyle modification, earlier. Using a similar genotype score with 16 SNPs in a Swedish cohort, Lyssenko and coworkers similarly found little clinically relevant improvement in risk stratification with genotype score, but an improved ability of genetic information to predict risk of diabetes in subjects followed longer. They further showed that clinical risk factors conversely decrease in their discriminative ability with longer duration of follow-up, suggesting a role for genetic testing in younger patients. This lack of clinically meaningful improvement in risk prediction from addition of currently available genetic information has also been shown in British and Dutch cohorts.
Given that the majority of diabetes risk alleles remain to be discovered, it is not surprising that a genotype score created from the currently available information has not improved diabetes risk prediction in a clinically significant way. Many more loci may need to be incorporated into the risk score to improve the discriminatory power of genetic information. Loci with higher odds ratios (>2) for diabetes would also enhance discriminatory power. It is worth noting that risk prediction with currently available clinical information is already excellent (C-statistics ranging from 0.7-0.9 in the above studies), which raises the threshold for additional genetic information to contribute meaningfully. As further efforts are made to incorporate genetic risk predictors from GWAS, the resulting information will have to be validated in a population-specific manner .
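To make the two quantities in this discussion concrete, the sketch below builds a genotype score and computes a C-statistic for it. It is a minimal illustration, not the method of the cited studies. It assumes an unweighted score (a simple count of risk alleles) and uses the pairwise-concordance definition of the C-statistic; the six individuals and their genotypes are invented.

```python
from itertools import product


def genotype_score(risk_allele_counts):
    """Unweighted score: total risk alleles carried (0, 1, or 2 per SNP)."""
    return sum(risk_allele_counts)


def c_statistic(scores, outcomes):
    """Fraction of (case, non-case) pairs in which the case has the higher score.

    Ties count as half; 0.5 means no discrimination, 1.0 perfect discrimination.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    noncases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score, noncase_score in product(cases, noncases):
        if case_score > noncase_score:
            concordant += 1.0
        elif case_score == noncase_score:
            concordant += 0.5
    return concordant / (len(cases) * len(noncases))


# Hypothetical genotypes at three SNPs for six individuals (1 = developed T2D).
genotypes = [(2, 1, 1), (1, 1, 1), (2, 2, 1), (1, 0, 1), (1, 2, 0), (0, 1, 0)]
outcomes = [1, 0, 1, 0, 1, 0]
scores = [genotype_score(g) for g in genotypes]
print(scores, round(c_statistic(scores, outcomes), 3))
```

A score adds clinical value only if the C-statistic of a model that includes it exceeds that of the clinical model alone. That bar is hard to clear when the clinical model already reaches 0.7-0.9.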
Pharmacogenetics describes the study of interactions between genetic loci and pharmacologic therapy. Its clinical application lies in directing specific therapies to subsets of patients based on efficacy and side effects. Monogenic diabetes has provided the most stunning examples of the clinical utility of this approach. Investigators have shown convincingly that patients with the clinical syndrome of permanent neonatal diabetes (diabetes presenting in the first weeks or months of life) who had mutations in the β-cell potassium channel Kir6.2 (encoded by KCNJ11) or the sulfonylurea receptor SUR1 (encoded by ABCC8) could be safely and effectively treated with high dose sulfonylureas in place of insulin. The use of sulfonylureas allowed for better glycemic control with no increased risk of hypoglycemia, and eliminated the clinical burden of insulin injections in infants and young children. Pearson et al. illustrated this concept for MODY patients in a randomized crossover trial of gliclazide and metformin in 36 patients with either diabetes caused by HNF1A mutations (MODY3) or T2D. Gliclazide was superior to metformin in MODY3 patients by glycemic endpoints, and MODY3 patients responded to gliclazide better than patients with T2D. This improved glycemic response was physiologically based on a large insulin secretion response to sulfonylureas specific to MODY3 patients. By contrast, patients with MODY2 (who have deleterious mutations in the glucokinase gene leading to a glucose sensing defect) failed to improve glycemic control with low/moderate dose insulin or oral hypoglycemics in a 3-month assessment. This stands to reason, as a defect in glucose sensing resulting in a higher setpoint of glycemia would simply cause the stimulus for endogenous insulin production to shut down with the addition of exogenous insulin or oral hypoglycemic agents.
Importantly, knowledge of the genetic basis of these patients’ diabetes alters clinical decision making as MODY2 patients rarely suffer secondary complications even without treatment.
Extending these results to polygenic diabetes has naturally been more difficult. In assessing the sulfonylurea response in diabetic patients carrying the common E23K variant in the KCNJ11 gene, investigators have found conflicting results: some have shown that carriers of the lysine (K) risk allele have a greater risk of sulfonylurea failure than E23E homozygotes, while others have not been able to find a difference. Assessment of variation at the PPARG locus has been equally conflicted: some investigators have demonstrated that carriers of the P12A polymorphism have a better response to thiazolidinediones than P12P homozygotes, but several others have been unable to show a difference [41-43]. Other common polymorphisms in the PPARG locus have also been tested and show no relationship to thiazolidinedione treatment response [42-44]. Most recently, 25 common SNPs in 11 candidate genes were screened in a Chinese cohort treated with gliclazide for 8 weeks: subjects with the Ala/Ala genotype at the ABCC8 A1369S polymorphism had a 7.7% greater decrease in FPG than Ser/Ser carriers. There were no statistically significant differences in HbA1c, although there was a consistent trend. Interestingly, this SNP is highly correlated with the E23K variant in the nearby KCNJ11 gene. These results must be verified in other populations, but given the ~18% of patients that would be expected to be Ala/Ala carriers, this finding has the potential to be clinically relevant at the population level.
Whereas the above pharmacogenetic approaches rely on educated guesses using candidate loci, understanding the actual biological functions of associated loci is not necessary to apply them clinically. TCF7L2 is thought to play a role in insulin secretion by influencing the incretin pathway, but its full function remains to be elucidated. Nevertheless, study of the T risk allele and oral hypoglycemic response has already yielded some interesting results. In a retrospective, observational study, Pearson et al. identified sulfonylurea and metformin-treated individuals in a Scottish cohort and genotyped them for the TCF7L2 variant. They found that TT carriers were more likely to fail sulfonylurea treatment, in a gene-dose dependent fashion; the response to metformin was independent of genotype. In a separate metformin study, Shu et al. reported that individuals carrying a reduced function allele in OCT-1 (organic cation transporter 1, which plays a role in hepatic metformin uptake) had higher glucose levels during an OGTT among metformin-treated non-diabetic subjects. Notably, the risk allele was not predictive of OGTT glucose levels when individuals were not treated with metformin. These data suggest that screening for the genetic basis of pharmacotherapeutic response in the same unbiased way ushered in by GWAS could uncover new pathways that affect drug response but are completely independent of diabetes molecular pathogenesis.
Historically, the genetic basis of diabetes has been informed by the discovery and characterization of single gene defects giving rise to diabetes. These discoveries have provided instructive examples of how understanding the genetic basis of diabetes can clinically inform treatment decisions and establish risk for progeny. However, monogenic forms of diabetes account for only 5% of diabetic patients. Genome-wide association screening has demonstrated that more common genetic variants individually have a relatively small association with the diabetic phenotype, but preliminary biochemical and cellular studies have shown that this is a viable method to rapidly and systematically uncover new molecular elements in diabetes pathogenesis. The 18 loci that have been discovered in the past 2 years alone indicate the explosion in discovery rate. These loci are still in early stages of functional characterization and they represent only 10% of the estimated genetic susceptibility of T2D. Exploration of the remaining loci will require a denser and deeper map of human genome variation (currently being constructed). The direct clinical application of newly discovered loci for genetic risk prediction is currently premature as it does not increase the discriminative ability over that of common clinical risk factors. Whether these genetic loci will direct response to different classes of therapeutics must be empirically tested.
Dr. Florez is supported by NIH Research Career Award K23 DK65978-05, a Physician Scientist Development Award by the Massachusetts General Hospital, and a Clinical Scientist Development Award from the Doris Duke Charitable Foundation. Dr. Florez has received consulting honoraria from Merck, bioStrategies, XOMA and Publicis Healthcare Communications Group, a global advertising agency engaged by Amylin Pharmaceuticals.