Related Articles

1.  Integrated Enrichment Analysis of Variants and Pathways in Genome-Wide Association Studies Indicates Central Role for IL-2 Signaling Genes in Type 1 Diabetes, and Cytokine Signaling Genes in Crohn's Disease 
PLoS Genetics  2013;9(10):e1003770.
Pathway analyses of genome-wide association studies aggregate information over sets of related genes, such as genes in common pathways, to identify gene sets that are enriched for variants associated with disease. We develop a model-based approach to pathway analysis, and apply this approach to data from the Wellcome Trust Case Control Consortium (WTCCC) studies. Our method offers several benefits over existing approaches. First, our method not only interrogates pathways for enrichment of disease associations, but also estimates the level of enrichment, which yields a coherent way to promote variants in enriched pathways, enhancing discovery of genes underlying disease. Second, our approach allows for multiple enriched pathways, a feature that leads to novel findings in two diseases where the major histocompatibility complex (MHC) is a major determinant of disease susceptibility. Third, by modeling disease as the combined effect of multiple markers, our method automatically accounts for linkage disequilibrium among variants. Interrogation of pathways from eight pathway databases yields strong support for enriched pathways, indicating links between Crohn's disease (CD) and cytokine-driven networks that modulate immune responses; between rheumatoid arthritis (RA) and “Measles” pathway genes involved in immune responses triggered by measles infection; and between type 1 diabetes (T1D) and IL2-mediated signaling genes. Prioritizing variants in these enriched pathways yields many additional putative disease associations compared to analyses without enrichment. For CD and RA, 7 of 8 additional non-MHC associations are corroborated by other studies, providing validation for our approach. For T1D, prioritization of IL-2 signaling genes yields strong evidence for 7 additional non-MHC candidate disease loci, as well as suggestive evidence for several more. 
Of the 7 strongest associations, 4 are validated by other studies, and 3 (near IL-2 signaling genes RAF1, MAPK14, and FYN) constitute novel putative T1D loci for further study.
Author Summary
Genome-wide association studies have helped locate gene variants that affect our susceptibility to diseases. The analysis of these studies is typically straightforward: test each genetic variant whether it is correlated with predisposition to disease. This approach often works well for identifying commonly occurring variants with moderate effects on disease risk. However, the effects of many variants are so small they fail to register statistically significant correlations. This is a concern because many diseases are modulated by many genetic factors with small effects on disease risk. An alternative is to examine groups of variants, such as variants sharing a common pathway, and assess whether these groups are “enriched” for correlations with disease. This can be a more effective approach to identifying genetic factors relevant to disease. However, it does not tell us which genes are associated with disease. To address this limitation, we describe an approach that integrates enrichment analysis with tests for disease-variant correlations within a single framework. We illustrate this approach in genome-wide studies of seven complex diseases. We show that our approach supports enriched pathways in several diseases, and uncovers disease-susceptibility genes in these pathways not identified in conventional analyses of the same data.
PMCID: PMC3789883  PMID: 24098138
2.  Prioritizing genes for follow-up from genome wide association studies using information on gene expression in tissues relevant for type 2 diabetes mellitus 
BMC Medical Genomics  2009;2:72.
Genome-wide association studies (GWAS) have emerged as a powerful approach for identifying susceptibility loci associated with polygenic diseases such as type 2 diabetes mellitus (T2DM). However, it is still a daunting task to prioritize single nucleotide polymorphisms (SNPs) from GWAS for further replication in different populations. Several recent studies have shown that genetic variation often affects gene expression at proximal (cis) as well as distal (trans) genomic locations by different mechanisms, such as altering the rate of transcription, splicing, or transcript stability.
To prioritize SNPs from GWAS, we combined results from two GWAS related to T2DM, the Diabetes Genetics Initiative (DGI) and the Wellcome Trust Case Control Consortium (WTCCC), with genome-wide expression data from pancreas, adipose tissue, liver and skeletal muscle of individuals with or without T2DM or animal models thereof to identify T2DM susceptibility loci.
We identified 1,170 SNPs associated with T2DM with P < 0.05 in both GWAS and 243 genes that were located in the vicinity of these SNPs. Out of these 243 genes, we identified 115 differentially expressed in publicly available gene expression profiling data. Notably, five of them, IGF2BP2, KCNJ11, NOTCH2, TCF7L2 and TSPAN8, have subsequently been shown to be associated with T2DM in different populations. To provide further validation of our approach, we reversed it and started with 26 known SNPs associated with T2DM and related traits. We showed that 12 (57%) of the 21 genes located in the vicinity of these SNPs (HHEX, HNF1B, IGF2BP2, IRS1, KCNJ11, KCNQ1, NOTCH2, PPARG, TCF7L2, THADA, TSPAN8 and WFS1) showed aberrant expression in T2DM in the gene expression profiling studies.
Using gene expression profiling data from different tissues of individuals with or without T2DM, or from animal models thereof, is a powerful tool for prioritizing SNPs from GWAS for further replication studies.
PMCID: PMC2815699  PMID: 20043853
3.  Multiple type 2 diabetes susceptibility genes following genome-wide association scan in UK samples 
Science (New York, N.Y.)  2007;316(5829):1336-1341.
The molecular mechanisms involved in the development of type 2 diabetes are poorly understood. Starting from genome-wide genotype data for 1,924 diabetic cases and 2,938 population controls generated by the Wellcome Trust Case Control Consortium, we set out to detect replicated diabetes association signals through analysis of 3,757 additional cases and 5,346 controls, and by integration of our findings with equivalent data from other international consortia. We detected diabetes susceptibility loci in and around the genes CDKAL1, CDKN2A/CDKN2B and IGF2BP2 and confirmed the recently described associations at HHEX/IDE and SLC30A8. Our findings provide insights into the genetic architecture of type 2 diabetes, emphasizing the contribution of multiple variants of modest effect. The regions identified underscore the importance of pathways influencing pancreatic beta cell development and function in the etiology of type 2 diabetes.
PMCID: PMC3772310  PMID: 17463249
4.  Pathway analysis of breast cancer genome wide association study highlights three pathways and one canonical signaling cascade 
Cancer research  2010;70(11):4453-4459.
Genome-wide association studies (GWAS) focus on relatively few highly significant loci while less attention is given to other genotyped markers. Employing pathway analysis to existing GWAS data may shed light on relevant biological processes, and illuminate new candidate genes. We employed a pathway-based approach to the breast cancer GWAS data of the National Cancer Institute (NCI) Cancer Genetic Markers of Susceptibility (CGEMS) project that includes 1145 cases and 1142 controls. Pathways were retrieved from three databases: KEGG, BioCarta, and the NCI’s Protein Interaction Database (PID). Genes were represented by their most strongly associated SNP, and an enrichment score (ES) reflecting the overrepresentation of gene-based association signals in each pathway was calculated using a weighted Kolmogorov-Smirnov procedure. Finally, hierarchical clustering was used to identify pathways with overlapping genes, and clusters with excess of association signals were determined by the adaptive rank-truncated product (ARTP) method. A total of 421 pathways containing 3962 genes were included in our study. Of these, three pathways (‘Syndecan-1-mediated signaling’, ‘Signaling of Hepatocyte Growth Factor Receptor’ and ‘Growth Hormone Signaling’) were highly enriched with association signals (P_ES < 0.001, False Discovery Rate (FDR) = 0.118). Our clustering analysis revealed that pathways containing key components of the RAS/RAF/MAPK canonical signaling cascade were significantly more likely to have excess of association signals than expected by chance (P_ARTP = 0.0051, FDR = 0.07). These results suggest that genetic alterations associated with these three pathways and one canonical signaling cascade may contribute to breast cancer susceptibility.
PMCID: PMC2907250  PMID: 20460509
Pathways; GWAS; Breast cancer; Susceptibility; Genetics
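To illustrate the kind of enrichment score mentioned in the breast cancer pathway abstract above, here is a minimal Python sketch of a weighted Kolmogorov-Smirnov running-sum statistic in the style of gene set enrichment analysis. The function name, the `weight` parameter, and the toy gene labels are assumptions made for illustration; the abstract does not specify the weighting or normalization the authors used.

```python
from typing import Sequence, Set


def enrichment_score(ranked_genes: Sequence[str],
                     scores: Sequence[float],
                     gene_set: Set[str],
                     weight: float = 1.0) -> float:
    """Weighted KS-like running-sum statistic (GSEA-style sketch).

    ranked_genes must already be sorted from strongest to weakest
    association; scores holds the matching per-gene statistics.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    total_hit = sum(abs(s) ** weight for s, hit in zip(scores, in_set) if hit)
    if total_hit == 0 or n_miss == 0:
        return 0.0  # degenerate case: no hits or no misses to walk over

    running = 0.0
    best = 0.0
    for s, hit in zip(scores, in_set):
        if hit:
            # step up, proportional to the gene's weighted score
            running += abs(s) ** weight / total_hit
        else:
            # step down by a constant amount for each gene outside the set
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


# Toy data (hypothetical): four genes ranked by association strength.
genes = ["A", "B", "C", "D"]
stats = [4.0, 3.0, 2.0, 1.0]
es_top = enrichment_score(genes, stats, {"A", "B"})
es_bottom = enrichment_score(genes, stats, {"C", "D"})
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one. In practice, significance is judged by comparing the score with a permutation-based null distribution.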
5.  TXNIP Regulates Peripheral Glucose Metabolism in Humans  
PLoS Medicine  2007;4(5):e158.
Type 2 diabetes mellitus (T2DM) is characterized by defects in insulin secretion and action. Impaired glucose uptake in skeletal muscle is believed to be one of the earliest features in the natural history of T2DM, although underlying mechanisms remain obscure.
Methods and Findings
We combined human insulin/glucose clamp physiological studies with genome-wide expression profiling to identify thioredoxin interacting protein (TXNIP) as a gene whose expression is powerfully suppressed by insulin yet stimulated by glucose. In healthy individuals, its expression was inversely correlated to total body measures of glucose uptake. Forced expression of TXNIP in cultured adipocytes significantly reduced glucose uptake, while silencing with RNA interference in adipocytes and in skeletal muscle enhanced glucose uptake, confirming that the gene product is also a regulator of glucose uptake. TXNIP expression is consistently elevated in the muscle of prediabetics and diabetics, although in a panel of 4,450 Scandinavian individuals, we found no evidence for association between common genetic variation in the TXNIP gene and T2DM.
TXNIP regulates both insulin-dependent and insulin-independent pathways of glucose uptake in human skeletal muscle. Combined with recent studies that have implicated TXNIP in pancreatic β-cell glucose toxicity, our data suggest that TXNIP might play a key role in defective glucose homeostasis preceding overt T2DM.
Vamsi Mootha, Leif Groop, and colleagues report that TXNIP regulates insulin-dependent and -independent pathways of glucose uptake in human skeletal muscle and that its expression is elevated in individuals with prediabetes and type 2 diabetes.
Editors' Summary
An epidemic of diabetes mellitus is threatening world health. 246 million people (6% of the world's population) already have diabetes and it is estimated that within 20 years, 380 million people will have this chronic disease, most of them in developing countries. Diabetes is characterized by high blood sugar (glucose) levels. It arises when the pancreas does not make enough insulin (type 1 diabetes) or when the body responds poorly to insulin (type 2 diabetes). Insulin, which is released in response to high blood glucose levels, instructs muscle, fat, and liver cells to take glucose (a product of food digestion) out of the bloodstream; cells use glucose as a fuel. Type 2 diabetes, which accounts for 90% of all cases of diabetes, is characterized by impaired glucose uptake by target tissues in response to insulin (this “insulin resistance” is one of the first signs of type 2 diabetes) and inappropriate glucose release from liver cells. Over time, the pancreas may also make less insulin. These changes result in poor glucose homeostasis (inadequate control of blood sugar levels), which can cause life-threatening complications such as kidney failure and heart attacks.
Why Was This Study Done?
If the world diabetes epidemic is to be halted, researchers need a better understanding of glucose homeostasis and need to identify which parts of this complex control system go awry in type 2 diabetes. This information might suggest ways to prevent type 2 diabetes developing in the first place and might reveal targets for drugs that could slow or reverse the disease process. In this study, the researchers have used multiple approaches to identify a new mediator of glucose homeostasis and to investigate whether this mediator is causally involved in the development of type 2 diabetes.
What Did the Researchers Do and Find?
The researchers took small muscle samples from people who did not have diabetes before and after increasing their blood insulin levels and used a technique called “microarray expression profiling” to identify genes whose expression was induced or suppressed by insulin. One of the latter genes was thioredoxin interacting protein (TXNIP), a gene whose expression is strongly induced by glucose yet suppressed by insulin. They next used previously published microarray expression data to show that TXNIP expression was consistently higher in the muscles of patients with diabetes or prediabetes (a condition in which blood glucose levels are slightly raised) than in normal individuals. The researchers then examined whether TXNIP expression was correlated with glucose uptake, again using previously published data. In people with no diabetes and those with prediabetes, as glucose uptake rates increased, TXNIP expression decreased but this inverse correlation was missing in people with diabetes. Finally, by manipulating TXNIP expression levels in insulin-responsive cells grown in the laboratory, the researchers found that TXNIP overexpression reduced basal and insulin-stimulated glucose uptake but that reduced TXNIP expression had the opposite effect.
What Do These Findings Mean?
These results provide strong evidence that TXNIP is a regulator of glucose homeostasis in people. Specifically, the researchers propose that TXNIP regulates glucose uptake in the periphery of the human body by acting as a glucose- and insulin-sensitive switch. They also suggest how it might be involved in the development of type 2 diabetes. Early in the disease process, a small insulin deficiency or slightly raised blood sugar levels would increase TXNIP expression in muscles and suppress glucose uptake by these cells. Initially, the pancreas would compensate for this by producing more insulin, but this compensation would eventually fail, allowing blood sugar levels to rise sufficiently to increase TXNIP expression in the pancreas. Previously published results suggest that this would induce the loss of insulin-producing cells in the pancreas, thus further reducing insulin production and glucose uptake in the periphery and, ultimately, resulting in type 2 diabetes. Although there are many unanswered questions about the exact role of TXNIP in glucose homeostasis, these results help to explain many of the changes in glucose control that occur early in the development of diabetes. Furthermore, they suggest that interventions designed to modulate the activity of TXNIP might break the vicious cycle that eventually leads to type 2 diabetes.
Additional Information.
Please access these Web sites via the online version of this summary at
The MedlinePlus encyclopedia has pages on diabetes
The US National Institute of Diabetes and Digestive and Kidney Diseases has information for patients on diabetes
Information on diabetes is available for patients and professionals from the US Centers for Disease Control and Prevention
The American Diabetes Association provides information on diabetes for patients
International Diabetes Federation has information on diabetes and a recent press release on the global diabetes epidemic
PMCID: PMC1858708  PMID: 17472435
6.  Two common genetic variants near nuclear-encoded OXPHOS genes are associated with insulin secretion in vivo 
European Journal of Endocrinology  2011;164(5):765-771.
Mitochondrial ATP production is important in the regulation of glucose-stimulated insulin secretion. Genetic factors may modulate the capacity of the β-cells to secrete insulin and thereby contribute to the risk of type 2 diabetes.
The aim of this study was to identify genetic loci in or adjacent to nuclear-encoded genes of the oxidative phosphorylation (OXPHOS) pathway that are associated with insulin secretion in vivo.
Design and methods
To find polymorphisms associated with glucose-stimulated insulin secretion, data from a genome-wide association study (GWAS) of 1467 non-diabetic individuals, including the Diabetes Genetic Initiative (DGI), was examined. A total of 413 single nucleotide polymorphisms with a minor allele frequency ≥0.05 located in or adjacent to 76 OXPHOS genes were included in the DGI GWAS. A more extensive population-based study of 4323 non-diabetics, the PPP-Botnia, was used as a replication cohort. Insulinogenic index during an oral glucose tolerance test was used as a surrogate marker of glucose-stimulated insulin secretion. Multivariate linear regression analyses were used to test genotype–phenotype associations.
Two common variants were identified in the DGI, where the major C-allele of rs606164, adjacent to NADH dehydrogenase (ubiquinone) 1 subunit C2 (NDUFC2), and the minor G-allele of rs1323070, adjacent to cytochrome c oxidase subunit VIIa polypeptide 2 (COX7A2), showed nominal associations with decreased glucose-stimulated insulin secretion (P=0.0009 and P=0.003, respectively). These associations were replicated in PPP-Botnia (P=0.002 and P=0.05).
Our study shows that genetic variation near genes involved in OXPHOS may influence glucose-stimulated insulin secretion in vivo.
PMCID: PMC3080761  PMID: 21325017
7.  New Susceptibility Loci Associated with Kidney Disease in Type 1 Diabetes 
Sandholm, Niina | Salem, Rany M. | McKnight, Amy Jayne | Brennan, Eoin P. | Forsblom, Carol | Isakova, Tamara | McKay, Gareth J. | Williams, Winfred W. | Sadlier, Denise M. | Mäkinen, Ville-Petteri | Swan, Elizabeth J. | Palmer, Cameron | Boright, Andrew P. | Ahlqvist, Emma | Deshmukh, Harshal A. | Keller, Benjamin J. | Huang, Huateng | Ahola, Aila J. | Fagerholm, Emma | Gordin, Daniel | Harjutsalo, Valma | He, Bing | Heikkilä, Outi | Hietala, Kustaa | Kytö, Janne | Lahermo, Päivi | Lehto, Markku | Lithovius, Raija | Österholm, Anne-May | Parkkonen, Maija | Pitkäniemi, Janne | Rosengård-Bärlund, Milla | Saraheimo, Markku | Sarti, Cinzia | Söderlund, Jenny | Soro-Paavonen, Aino | Syreeni, Anna | Thorn, Lena M. | Tikkanen, Heikki | Tolonen, Nina | Tryggvason, Karl | Tuomilehto, Jaakko | Wadén, Johan | Gill, Geoffrey V. | Prior, Sarah | Guiducci, Candace | Mirel, Daniel B. | Taylor, Andrew | Hosseini, S. Mohsen | Parving, Hans-Henrik | Rossing, Peter | Tarnow, Lise | Ladenvall, Claes | Alhenc-Gelas, François | Lefebvre, Pierre | Rigalleau, Vincent | Roussel, Ronan | Tregouet, David-Alexandre | Maestroni, Anna | Maestroni, Silvia | Falhammar, Henrik | Gu, Tianwei | Möllsten, Anna | Cimponeriu, Danut | Ioana, Mihai | Mota, Maria | Mota, Eugen | Serafinceanu, Cristian | Stavarachi, Monica | Hanson, Robert L. | Nelson, Robert G. | Kretzler, Matthias | Colhoun, Helen M. | Panduru, Nicolae Mircea | Gu, Harvest F. | Brismar, Kerstin | Zerbini, Gianpaolo | Hadjadj, Samy | Marre, Michel | Groop, Leif | Lajer, Maria | Bull, Shelley B. | Waggott, Daryl | Paterson, Andrew D. | Savage, David A. | Bain, Stephen C. | Martin, Finian | Hirschhorn, Joel N. | Godson, Catherine | Florez, Jose C. | Groop, Per-Henrik | Maxwell, Alexander P.
PLoS Genetics  2012;8(9):e1002921.
Diabetic kidney disease, or diabetic nephropathy (DN), is a major complication of diabetes and the leading cause of end-stage renal disease (ESRD) that requires dialysis treatment or kidney transplantation. In addition to the decrease in the quality of life, DN accounts for a large proportion of the excess mortality associated with type 1 diabetes (T1D). Whereas the degree of glycemia plays a pivotal role in DN, a subset of individuals with poorly controlled T1D do not develop DN. Furthermore, strong familial aggregation supports genetic susceptibility to DN. However, the genes and the molecular mechanisms behind the disease remain poorly understood, and current therapeutic strategies rarely result in reversal of DN. In the GEnetics of Nephropathy: an International Effort (GENIE) consortium, we have undertaken a meta-analysis of genome-wide association studies (GWAS) of T1D DN comprising ∼2.4 million single nucleotide polymorphisms (SNPs) imputed in 6,691 individuals. After additional genotyping of 41 top ranked SNPs representing 24 independent signals in 5,873 individuals, combined meta-analysis revealed association of two SNPs with ESRD: rs7583877 in the AFF3 gene (P = 1.2×10⁻⁸) and an intergenic SNP on chromosome 15q26 between the genes RGMA and MCTP2, rs12437854 (P = 2.0×10⁻⁹). Functional data suggest that AFF3 influences renal tubule fibrosis via the transforming growth factor-beta (TGF-β1) pathway. The strongest association with DN as a primary phenotype was seen for an intronic SNP in the ERBB4 gene (rs7588550, P = 2.1×10⁻⁷), a gene with type 2 diabetes DN differential expression and in the same intron as a variant with cis-eQTL expression of ERBB4. All these detected associations represent new signals in the pathogenesis of DN.
Author Summary
The global prevalence of diabetes has reached epidemic proportions, constituting a major health care problem worldwide. Diabetic kidney disease, or diabetic nephropathy (DN)—the major long term microvascular complication of diabetes—is associated with excess mortality among patients with type 1 diabetes. Even though DN has been shown to cluster in families, the underlying genetic and molecular pathways remain poorly defined. We have undertaken the largest genome-wide association study and meta-analysis to date on DN and on its most severe form of kidney disease, end-stage renal disease (ESRD). We identified new loci significantly associated with diabetic ESRD: AFF3 and an intergenic locus on chromosome 15q26 residing between RGMA and MCTP2. Our functional analyses suggest that AFF3 influences renal tubule fibrosis, a pathological hallmark of severe DN. Another locus in ERBB4 was suggestively associated with DN and resides in the same intronic region as a variant affecting the expression of ERBB4. Subsequent pathway analysis of the genes co-expressed with ERBB4 indicated involvement of fibrosis.
PMCID: PMC3447939  PMID: 23028342
8.  Association of RASGRP1 with type 1 diabetes is revealed by combined follow-up of two genome-wide studies 
Journal of Medical Genetics  2009;46(8):553-554.
The two genome-wide association studies published by us and by the Wellcome Trust Case-Control Consortium (WTCCC) revealed a number of novel loci but neither had the statistical power to elucidate all of the genetic components of type 1 diabetes risk, a task for which larger effective sample sizes are needed.
We analyzed data from two sources: 1) The previously published second stage of our study, with a total sample size of the two stages consisting of 1,046 Canadian case-parent trios and 538 multiplex families with 929 affected offspring from the Type 1 Diabetes Genetics Consortium (T1DGC); 2) The RR2 project of the T1DGC, which genotyped 4,417 individuals from 1,062 non-overlapping families, including 2,059 affected individuals (mostly sibling pairs) for the 1,536 markers with the highest statistical significance for type 1 diabetes in the WTCCC results.
One locus, mapping to an LD block at chr15q14, reached statistical significance by combining results from two markers (rs17574546 and rs7171171) in perfect linkage disequilibrium (LD) with each other (r²=1). We obtained a joint p value of 1.3×10⁻⁶, which exceeds by an order of magnitude the conservative threshold of 3.26×10⁻⁵ obtained by correcting for the 1,536 SNPs tested in our study. Meta-analysis with the original WTCCC genome-wide data produced a p value of 5.83×10⁻⁹.
A novel type 1 diabetes locus was discovered. It involves RASGRP1, a gene known to play a crucial role in thymocyte differentiation and TCR signaling by activating the Ras signaling pathway.
PMCID: PMC3272492  PMID: 19465406
Etiology; Genetic susceptibility; Type 1 diabetes; RASGRP1
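The conservative threshold quoted in the RASGRP1 abstract above is consistent with a Bonferroni correction for 1,536 tests. The Python sketch below works through that arithmetic. The family-wise error rate of 0.05 is an assumption (the abstract does not state it), chosen because it reproduces the reported 3.26×10⁻⁵; the function name is likewise illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# alpha = 0.05 is an assumed family-wise error rate, not stated in the abstract.
threshold = bonferroni_threshold(0.05, 1536)

# Joint p value reported in the abstract for the chr15q14 locus.
joint_p = 1.3e-6

print(f"Bonferroni threshold: {threshold:.3g}")
print(f"threshold / joint p : {threshold / joint_p:.1f}")
```

The ratio printed on the last line shows how far the joint p value falls below the threshold, which is the sense in which the abstract says the result "exceeds" it by an order of magnitude.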
9.  Exploring genome-wide – dietary heme iron intake interactions and the risk of type 2 diabetes 
Aims/hypothesis: Genome-wide association studies have identified over 50 new genetic loci for type 2 diabetes (T2D). Several studies conclude that higher dietary heme iron intake increases the risk of T2D. Therefore we assessed whether the relation between genetic loci and T2D is modified by dietary heme iron intake.
Methods: We used Affymetrix Genome-Wide Human 6.0 array data [681,770 single nucleotide polymorphisms (SNPs)] and dietary information collected in the Health Professionals Follow-up Study (n = 725 cases; n = 1,273 controls) and the Nurses’ Health Study (n = 1,081 cases; n = 1,692 controls). We assessed whether genome-wide SNPs or iron metabolism SNPs interacted with dietary heme iron intake in relation to T2D, testing for associations in each cohort separately and then meta-analyzing to pool the results. Finally, we created 1,000 synthetic pathways matched to an iron metabolism pathway on number of genes, and number of SNPs in each gene. We compared the iron metabolic pathway SNPs with these synthetic SNP assemblies in their relation to T2D to assess if the pathway as a whole interacts with dietary heme iron intake.
Results: Using a genomic approach, we found no significant gene–environment interactions with dietary heme iron intake in relation to T2D at a Bonferroni corrected genome-wide significance level of 7.33 × 10⁻⁸ (top SNP in pooled analysis: intergenic rs10980508; p = 1.03 × 10⁻⁶). Furthermore, no SNP in the iron metabolic pathway significantly interacted with dietary heme iron intake at a Bonferroni corrected significance level of 2.10 × 10⁻⁴ (top SNP in pooled analysis: rs1805313; p = 1.14 × 10⁻³). Finally, neither the main genetic effects (pooled empirical p by SNP = 0.41), nor gene – dietary heme–iron interactions (pooled empirical p-value for the interactions = 0.72) were significant for the iron metabolic pathway as a whole.
Conclusions: We found no significant interactions between dietary heme iron intake and common SNPs in relation to T2D.
PMCID: PMC3558725  PMID: 23386860
type 2 diabetes; gene environment interactions; dietary heme iron; pathway analysis
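The heme iron abstract above compares the iron metabolism pathway with 1,000 matched synthetic pathways and reports empirical p-values. One common way to turn such a comparison into an empirical p-value is sketched below in Python. The add-one correction, the function name, and the simulated null statistics are all assumptions for illustration; the abstract does not say which formula or test statistic the authors used.

```python
import random
from typing import Sequence


def empirical_p_value(observed: float, null_stats: Sequence[float]) -> float:
    """Share of null statistics at least as large as the observed one.

    Uses the add-one correction so the estimate is never exactly zero.
    """
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)


# Hypothetical example: 1,000 simulated statistics standing in for the
# synthetic pathways, and an arbitrary observed pathway statistic.
rng = random.Random(0)
synthetic_stats = [rng.gauss(0.0, 1.0) for _ in range(1000)]
observed_stat = 0.2
p_emp = empirical_p_value(observed_stat, synthetic_stats)
print(f"empirical p = {p_emp:.3f}")
```

With 1,000 synthetic pathways, the smallest p-value this estimator can return is 1/1001, so the number of synthetic pathways sets the resolution of the test.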
10.  The regulation-of-autophagy pathway may influence Chinese stature variation: evidence from elder adults 
Journal of human genetics  2010;55(7):441-447.
Recent success of genome-wide association studies (GWASs) on human height variation emphasized the effects of individual loci or genes. In this study, we used a pathway-based approach that we developed to further test biological pathways for potential association with stature, by examining ∼370 000 single-nucleotide polymorphisms (SNPs) across the human genome in 618 unrelated elder Han Chinese. A total of 626 biological pathways annotated by any of the three major public pathway databases (KEGG, BioCarta and Ambion GeneAssist Pathway Atlas) were tested. The regulation-of-autophagy (ROA) (nominal P=0.012) pathway was marginally significantly associated with human stature after our family-wise error rate multiple-testing correction. We also used 1000 randomly recruited US whites for further replication. Interestingly, the ROA pathway presented the strongest signals in whites for height variation (nominal P=0.002). The results correspond to biological roles of the ROA pathway in human long bone development and growth. Our findings also implied that multiple genetic factors may work jointly as a functional unit (pathway), and the traditional GWASs could have missed important genetic information embedded in those less significant markers.
PMCID: PMC2923432  PMID: 20448653
autophagy; GWAS; height; pathway; stature
11.  Pathway Analysis for Genome-Wide Association Study of Lung Cancer in Han Chinese Population 
PLoS ONE  2013;8(3):e57763.
Genome-wide association studies (GWAS) have identified a number of genetic variants associated with lung cancer risk. However, these loci explain only a small fraction of lung cancer heritability, and other variants with weak effects may be lost in the GWAS approach due to the stringent significance level after multiple comparison correction. In this study, in order to identify important pathways involved in lung carcinogenesis, we performed a two-stage pathway analysis in GWAS of lung cancer in Han Chinese using the gene set enrichment analysis (GSEA) method. Predefined pathways from the BioCarta and KEGG databases were systematically evaluated in the Nanjing study (Discovery stage: 1,473 cases and 1,962 controls) and the suggestive pathways were further validated in the Beijing study (Replication stage: 858 cases and 1,115 controls). We found that four pathways (achPathway, metPathway, At1rPathway and rac1Pathway) were consistently significant in both studies and the P values for the combined dataset were 0.012, 0.010, 0.022 and 0.005, respectively. These results were stable after sensitivity analysis based on gene definition and gene overlaps between pathways. These findings may provide new insights into the etiology of lung cancer.
PMCID: PMC3585721  PMID: 23469231
12.  Common Inherited Variation in Mitochondrial Genes Is Not Enriched for Associations with Type 2 Diabetes or Related Glycemic Traits 
PLoS Genetics  2010;6(8):e1001058.
Mitochondrial dysfunction has been observed in skeletal muscle of people with diabetes and insulin-resistant individuals. Furthermore, inherited mutations in mitochondrial DNA can cause a rare form of diabetes. However, it is unclear whether mitochondrial dysfunction is a primary cause of the common form of diabetes. To date, common genetic variants robustly associated with type 2 diabetes (T2D) are not known to affect mitochondrial function. One possibility is that multiple mitochondrial genes contain modest genetic effects that collectively influence T2D risk. To test this hypothesis we developed a method named Meta-Analysis Gene-set Enrichment of variaNT Associations (MAGENTA). MAGENTA, in analogy to Gene Set Enrichment Analysis, tests whether sets of functionally related genes are enriched for associations with a polygenic disease or trait. MAGENTA was specifically designed to exploit the statistical power of large genome-wide association (GWA) study meta-analyses whose individual genotypes are not available. This is achieved by combining variant association p-values into gene scores and then correcting for confounders, such as gene size, variant number, and linkage disequilibrium properties. Using simulations, we determined the range of parameters for which MAGENTA can detect associations likely missed by single-marker analysis. We verified MAGENTA's performance on empirical data by identifying known relevant pathways in lipid and lipoprotein GWA meta-analyses. We then tested our mitochondrial hypothesis by applying MAGENTA to three gene sets: nuclear regulators of mitochondrial genes, oxidative phosphorylation genes, and ∼1,000 nuclear-encoded mitochondrial genes. The analysis was performed using the most recent T2D GWA meta-analysis of 47,117 people and meta-analyses of seven diabetes-related glycemic traits (up to 46,186 non-diabetic individuals).
This well-powered analysis found no significant enrichment of associations to T2D or any of the glycemic traits in any of the gene sets tested. These results suggest that common variants affecting nuclear-encoded mitochondrial genes have at most a small genetic contribution to T2D susceptibility.
Author Summary
Mitochondria play a crucial role in metabolic homeostasis, and alteration of mitochondrial function is a hallmark of diabetes. While mitochondrial activity is reduced in people with diabetes, it is unclear whether mitochondrial dysfunction is a cause or effect of type 2 diabetes. Genome-wide association studies for type 2 diabetes have explained ≈10% of the heritability of the disease, but none of the loci are known to affect mitochondrial activity. It is possible though that a mitochondrial contribution is hidden in the remaining 90%. Hence, we tested the hypothesis that multiple mitochondria-related genes encoded in the nucleus, each having a weak effect (hard to detect individually), can collectively influence type 2 diabetes. To address this, we developed a computational method (MAGENTA) that allowed us to adequately analyze large collective datasets of human genetic variation obtained from collaborative studies of type 2 diabetes and related glycemic traits. Despite the increased sensitivity of MAGENTA compared to single-DNA variant analysis, we found no support for a causal relationship between mitochondrial dysfunction and type 2 diabetes. These results may help steer future efforts in understanding the pathogenesis of the disease. MAGENTA is broadly applicable to testing associations between other biological pathways and common diseases or traits.
PMCID: PMC2920848  PMID: 20714348
13.  1000 Genomes-based imputation identifies novel and refined associations for the Wellcome Trust Case Control Consortium phase 1 Data 
We hypothesize that imputation based on data from the 1000 Genomes Project can identify novel association signals on a genome-wide scale due to the dense marker map and the large number of haplotypes. To test the hypothesis, the Wellcome Trust Case Control Consortium (WTCCC) Phase I genotype data were imputed using 1000 Genomes as reference (20100804 EUR), and seven case/control association studies were performed using imputed dosages. We observed two ‘missed’ disease-associated variants that were undetectable by the original WTCCC analysis, but were reported by later studies after the 2007 WTCCC publication. One is within the IL2RA gene for association with type 1 diabetes and the other in proximity to the CDKN2B gene for association with type 2 diabetes. We also identified two refined associations. One is SNP rs11209026 in exon 9 of IL23R for association with Crohn's disease, which is predicted to be probably damaging by PolyPhen2. The other refined variant is in the CUX2 gene region for association with type 1 diabetes, where the newly identified top SNP rs1265564 has an association P-value of 1.68 × 10−16. The new lead SNPs for the two refined loci provide a more plausible explanation for the disease association. We demonstrated that 1000 Genomes-based imputation could indeed identify both novel (in our case, ‘missed’ because they were detected and replicated by studies after 2007) and refined signals. We anticipate the findings derived from this study to provide timely information when individual groups and consortia are beginning to engage in 1000 Genomes-based imputation.
PMCID: PMC3376268  PMID: 22293688
genome-wide association study; the 1000 Genomes project; imputation
14.  Common and Rare Variant Analysis in Early-Onset Bipolar Disorder Vulnerability 
PLoS ONE  2014;9(8):e104326.
Bipolar disorder is one of the most common and devastating psychiatric disorders whose mechanisms remain largely unknown. Despite a strong genetic contribution demonstrated by twin and adoption studies, a polygenic background influences this multifactorial and heterogeneous psychiatric disorder. To identify susceptibility genes on a severe and more familial sub-form of the disease, we conducted a genome-wide association study focused on 211 patients of French origin with an early age at onset and 1,719 controls, and then replicated our data on a German sample of 159 patients with early-onset bipolar disorder and 998 controls. Replication study and subsequent meta-analysis revealed two genes encoding proteins involved in the phosphoinositide signalling pathway (PLEKHA5 and PLCXD3). We performed additional replication studies in two datasets from the WTCCC (764 patients and 2,938 controls) and the GAIN-TGen cohorts (1,524 patients and 1,436 controls) and found nominally significant P-values at both the PLCXD3 and PLEKHA5 loci in the WTCCC sample. In addition, we identified in the French cohort one affected individual with a deletion at the PLCXD3 locus and another one carrying a missense variation in PLCXD3 (p.R93H), both supporting a role of the phosphatidylinositol pathway in early-onset bipolar disorder vulnerability. Although the current nominally significant findings should be interpreted with caution and need replication in independent cohorts, this study supports the strategy to combine genetic approaches to determine the molecular mechanisms underlying bipolar disorder.
PMCID: PMC4128749  PMID: 25111785
15.  Pathway-based analysis of primary biliary cirrhosis genome-wide association studies 
Genes and immunity  2013;14(3):179-186.
Genome-wide association studies (GWAS) have successfully identified several loci associated with primary biliary cirrhosis (PBC) risk. Pathway analysis complements conventional GWAS analysis. We applied the recently developed linear combination test for pathways to datasets drawn from independent PBC GWAS in Italian and Canadian subjects. Of the Kyoto Encyclopedia of Genes and Genomes and BioCarta pathways tested, 25 pathways in the Italian dataset (449 cases, 940 controls) and 26 pathways in the Canadian dataset (530 cases, 398 controls) were associated with PBC susceptibility (P < 0.05). After correcting for multiple comparisons, only the eight most significant pathways in the Italian dataset had FDR < 0.25 with tumor necrosis factor/stress-related signaling emerging as the top pathway (P = 7.38 × 10−4, FDR = 0.18). Two pathways, phosphatidylinositol signaling and hedgehog signaling, were replicated in both datasets (P < 0.05), and subjected to two additional complementary pathway tests. Both pathway signals remained significant in the Italian dataset on modified gene set enrichment analysis (P < 0.05). In both GWAS, variants nominally associated with PBC were significantly overrepresented in the phosphatidylinositol pathway (Fisher exact P < 0.05). These results point to established and novel pathway-level associations with inherited predisposition to PBC that on further independent replication and functional validation, may provide fresh insights into PBC etiology.
PMCID: PMC3780793  PMID: 23392275
linear combination test; phosphatidylinositol signaling; hedgehog signaling; autoimmune disease
16.  Mendelian Randomization Study of B-Type Natriuretic Peptide and Type 2 Diabetes: Evidence of Causal Association from Population Studies 
PLoS Medicine  2011;8(10):e1001112.
Using mendelian randomization, Roman Pfister and colleagues demonstrate a potentially causal link between low levels of B-type natriuretic peptide (BNP), a hormone released by damaged hearts, and the development of type 2 diabetes.
Genetic and epidemiological evidence suggests an inverse association between B-type natriuretic peptide (BNP) levels in blood and risk of type 2 diabetes (T2D), but the prospective association of BNP with T2D is uncertain, and it is unclear whether the association is confounded.
Methods and Findings
We analysed the association between levels of the N-terminal fragment of pro-BNP (NT-pro-BNP) in blood and risk of incident T2D in a prospective case-cohort study and genotyped the variant rs198389 within the BNP locus in three T2D case-control studies. We combined our results with existing data in a meta-analysis of 11 case-control studies. Using a Mendelian randomization approach, we compared the observed association between rs198389 and T2D to that expected from the NT-pro-BNP level to T2D association and the NT-pro-BNP difference per C allele of rs198389. In participants of our case-cohort study who were free of T2D and cardiovascular disease at baseline, we observed a 21% (95% CI 3%–36%) decreased risk of incident T2D per one standard deviation (SD) higher log-transformed NT-pro-BNP levels in analysis adjusted for age, sex, body mass index, systolic blood pressure, smoking, family history of T2D, history of hypertension, and levels of triglycerides, high-density lipoprotein cholesterol, and low-density lipoprotein cholesterol. The association between rs198389 and T2D observed in case-control studies (odds ratio = 0.94 per C allele, 95% CI 0.91–0.97) was similar to that expected (0.96, 0.93–0.98) based on the pooled estimate for the log-NT-pro-BNP level to T2D association derived from a meta-analysis of our study and published data (hazard ratio = 0.82 per SD, 0.74–0.90) and the difference in NT-pro-BNP levels (0.22 SD, 0.15–0.29) per C allele of rs198389. No significant associations were observed between the rs198389 genotype and potential confounders.
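The "expected" association quoted above can be reproduced approximately from the two reported estimates. The sketch below assumes the standard Mendelian randomization logic that the expected log effect per allele equals the log effect per SD of NT-pro-BNP multiplied by the SD difference per allele, and it treats the hazard ratio as comparable to an odds ratio; the function name is hypothetical and the confidence interval is not reconstructed.

```python
import math


def expected_or_per_allele(hr_per_sd, sd_per_allele):
    """Expected per-allele effect implied by a biomarker-outcome association.

    log(expected) = (SD change in biomarker per allele) * log(effect per SD).
    """
    return math.exp(sd_per_allele * math.log(hr_per_sd))


# Reported inputs: hazard ratio 0.82 per SD, 0.22 SD higher NT-pro-BNP per C allele.
expected = expected_or_per_allele(0.82, 0.22)
print(round(expected, 2))  # 0.96, in line with the expected value reported above
```

Under these assumptions the point estimate agrees with the reported expected odds ratio of 0.96, which sits close to the observed 0.94 per C allele.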
Our results provide evidence for a potential causal role of the BNP system in the aetiology of T2D. Further studies are needed to investigate the mechanisms underlying this association and possibilities for preventive interventions.
Editors' Summary
Worldwide, nearly 250 million people have diabetes, and this number is increasing rapidly. Diabetes is characterized by dangerous amounts of sugar (glucose) in the blood. Blood sugar levels are normally controlled by insulin, a hormone that the pancreas releases after meals (digestion of food produces glucose). In people with type 2 diabetes (the most common form of diabetes), blood sugar control fails because the fat and muscle cells that usually respond to insulin by removing sugar from the blood become insulin resistant. Type 2 diabetes can be controlled with diet and exercise, and with drugs that help the pancreas make more insulin or that make cells more sensitive to insulin. The long-term complications of diabetes, which include kidney failure and an increased risk of cardiovascular problems such as heart disease and stroke, reduce the life expectancy of people with diabetes by about 10 years compared to people without diabetes.
Why Was This Study Done?
Because the causes of type 2 diabetes are poorly understood, it is hard to devise ways to prevent the condition. Recently, B-type natriuretic peptide (BNP, a hormone released by damaged hearts) has been implicated in type 2 diabetes development in cross-sectional studies (investigations in which data are collected at a single time point from a population to look for associations between an illness and potential risk factors). Although these studies suggest that high levels of BNP may protect against type 2 diabetes, they cannot prove a causal link between BNP levels and diabetes because the study participants with low BNP levels may share some other unknown factor (a confounding factor) that is the real cause of both diabetes and altered BNP levels. Here, the researchers use an approach called “Mendelian randomization” to examine whether reduced BNP levels contribute to causing type 2 diabetes. It is known that a common genetic variant (rs198389) within the genome region that encodes BNP is associated with a reduced risk of type 2 diabetes. Because gene variants are inherited randomly, they are not subject to confounding. So, by investigating the association between BNP gene variants that alter NT-pro-BNP (a molecule created when BNP is being produced) levels and the development of type 2 diabetes, the researchers can discover whether BNP is causally involved in this chronic condition.
What Did the Researchers Do and Find?
The researchers analyzed the association between blood levels of NT-pro-BNP at baseline in 440 participants of the EPIC-Norfolk study (a prospective population-based study of lifestyle factors and the risk of chronic diseases) who subsequently developed diabetes and in 740 participants who did not develop diabetes. In this prospective case-cohort study, the risk of developing type 2 diabetes was associated with lower NT-pro-BNP levels. They also genotyped (sequenced) rs198389 in the participants of three case-control studies of type 2 diabetes (studies in which potential risk factors for type 2 diabetes were examined in people with type 2 diabetes and matched controls living in the East of England), and combined these results with those of eight similar published case-control studies. Finally, the researchers showed that the association between rs198389 and type 2 diabetes measured in the case-control studies was similar to the expected association calculated from the association between NT-pro-BNP level and type 2 diabetes obtained from the prospective case-cohort study and the association between rs198389 and BNP levels obtained from the EPIC-Norfolk study and other published studies.
What Do These Findings Mean?
The results of this Mendelian randomization study provide evidence for a causal, protective role of the BNP hormone system in the development of type 2 diabetes. That is, these findings suggest that low levels of BNP are partly responsible for the development of type 2 diabetes. Because the participants in all the individual studies included in this analysis were of European descent, these findings may not be generalizable to other ethnicities. Moreover, they provide no explanation of how alterations in the BNP hormone system might affect the development of type 2 diabetes. Nevertheless, the demonstration of a causal link between the BNP hormone system and type 2 diabetes suggests that BNP may be a potential target for interventions designed to prevent type 2 diabetes, particularly since the feasibility of altering BNP levels with drugs has already been proven in patients with cardiovascular disease.
Additional Information
Please access these websites via the online version of this summary at
The International Diabetes Federation provides information about all aspects of diabetes
The US National Diabetes Information Clearinghouse provides detailed information about diabetes for patients, health-care professionals, and the general public (in English and Spanish)
The UK National Health Service Choices website also provides information for patients and carers about type 2 diabetes and includes people's stories about diabetes
MedlinePlus provides links to further resources and advice about diabetes (in English and Spanish)
Wikipedia has pages on BNP and on Mendelian randomization (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
The charity Healthtalkonline has interviews with people about their experiences of diabetes; the charity Diabetes UK has a further selection of stories from people with diabetes
PMCID: PMC3201934  PMID: 22039354
17.  Pathway Analysis for Genome-Wide Association Study of Basal Cell Carcinoma of the Skin 
PLoS ONE  2011;6(7):e22760.
Recently, a pathway-based approach has been developed to evaluate the cumulative contribution of the functionally related genes for genome-wide association studies (GWASs), which may help utilize GWAS data to a greater extent.
In this study, we applied this approach for the GWAS of basal cell carcinoma (BCC) of the skin. We first conducted the BCC GWAS among 1,797 BCC cases and 5,197 controls in Caucasians with 740,760 genotyped SNPs. 115,688 SNPs were grouped into gene transcripts within 20 kb in distance and then into 174 Kyoto Encyclopedia of Genes and Genomes pathways, 205 BioCarta pathways, as well as two positive control gene sets (pigmentation gene set and BCC risk gene set). The association of each pathway with BCC risk was evaluated using the weighted Kolmogorov-Smirnov test. One thousand permutations were conducted to assess the significance.
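A weighted Kolmogorov-Smirnov enrichment score with permutation-based significance, as named above, can be sketched as follows. This is a generic GSEA-style running-sum statistic written for illustration, not the authors' code: the function names, the weight of 1.0, and the use of random gene sets (rather than permuted case/control labels) are assumptions.

```python
import random


def weighted_ks_score(ranked_genes, stats, gene_set, weight=1.0):
    """Running-sum enrichment score over a ranked gene list.

    Genes in the set raise the sum in proportion to |statistic|**weight;
    genes outside the set lower it by a constant step.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0
    total = sum(abs(stats[g]) ** weight
                for g, hit in zip(ranked_genes, in_set) if hit)
    if total == 0:
        return 0.0
    running, best = 0.0, 0.0
    for g, hit in zip(ranked_genes, in_set):
        if hit:
            running += abs(stats[g]) ** weight / total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


def permutation_pvalue(stats, gene_set, n_perm=1000, seed=0):
    """Empirical p-value from random gene sets of the same size."""
    ranked = sorted(stats, key=stats.get, reverse=True)
    members = set(gene_set) & set(ranked)
    observed = weighted_ks_score(ranked, stats, members)
    rng = random.Random(seed)
    genes = sorted(stats)
    exceed = 0
    for _ in range(n_perm):
        fake = set(rng.sample(genes, len(members)))
        if weighted_ks_score(ranked, stats, fake) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

With 1,000 permutations the smallest attainable p-value under this add-one estimator is about 0.001, which bounds the resolution of any pathway p-value computed this way.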
Both of the positive control gene sets reached pathway p-values < 0.05. Four other pathways were also significantly associated with BCC risk: the heparan sulfate biosynthesis pathway (p = 0.007, false discovery rate, FDR = 0.35), the mCalpain pathway (p = 0.002, FDR = 0.12), the Rho cell motility signaling pathway (p = 0.011, FDR = 0.30), and the nitric oxide pathway (p = 0.022, FDR = 0.42).
We identified four pathways associated with BCC risk, which may offer new insights into the etiology of BCC upon further validation, and this approach may help identify potential biological pathways that might be missed by the standard GWAS approach.
PMCID: PMC3145747  PMID: 21829505
18.  Genome-Wide Association Scan Allowing for Epistasis in Type 2 Diabetes 
Annals of human genetics  2010;75(1):10-19.
In the presence of epistasis multilocus association tests of human complex traits can provide powerful methods to detect susceptibility variants. We undertook multilocus analyses in 1924 type 2 diabetes cases and 2938 controls from the Wellcome Trust Case Control Consortium (WTCCC). We performed a two-dimensional genome-wide association (GWA) scan using joint two-locus tests of association including main and epistatic effects in 70,236 markers tagging common variants. We found two-locus association at 79 SNP-pairs at a Bonferroni-corrected P-value = 0.05 (uncorrected P-value = 2.14 × 10−11). The 79 pair-wise results always contained rs11196205 in TCF7L2 paired with 79 variants including confirmed variants in FTO, TSPAN8, and CDKAL1, which are associated in the absence of epistasis. However, the majority (82%) of the 79 variants did not have compelling single-locus association signals (P-value = 5 × 10−4). Analyses conditional on the single-locus effects at TCF7L2 established that the joint two-locus results could be attributed to single-locus association at TCF7L2 alone. Interaction analyses among the peak 80 regions and among 23 previously established diabetes candidate genes identified five SNP-pairs with case-control and case-only epistatic signals. Our results demonstrate the feasibility of systematic scans in GWA data, but confirm that single-locus association can underlie and obscure multilocus findings.
PMCID: PMC3430851  PMID: 21133856
Epistasis; simultaneous search; joint effects; genome-wide association
19.  Adiposity-Related Heterogeneity in Patterns of Type 2 Diabetes Susceptibility Observed in Genome-Wide Association Data 
Diabetes  2009;58(2):505-510.
OBJECTIVE—This study examined how differences in the BMI distribution of type 2 diabetic case subjects affected genome-wide patterns of type 2 diabetes association and considered the implications for the etiological heterogeneity of type 2 diabetes.
RESEARCH DESIGN AND METHODS—We reanalyzed data from the Wellcome Trust Case Control Consortium genome-wide association scan (1,924 case subjects, 2,938 control subjects: 393,453 single-nucleotide polymorphisms [SNPs]) after stratifying case subjects (into “obese” and “nonobese”) according to median BMI (30.2 kg/m2). Replication of signals in which alternative case-ascertainment strategies generated marked effect size heterogeneity in type 2 diabetes association signal was sought in additional samples.
RESULTS—In the “obese-type 2 diabetes” scan, FTO variants had the strongest type 2 diabetes effect (rs8050136: relative risk [RR] 1.49 [95% CI 1.34–1.66], P = 1.3 × 10−13), with only weak evidence for TCF7L2 (rs7901695 RR 1.21 [1.09–1.35], P = 0.001). This situation was reversed in the “nonobese” scan, with FTO association undetectable (RR 1.07 [0.97–1.19], P = 0.19) and TCF7L2 predominant (RR 1.53 [1.37–1.71], P = 1.3 × 10−14). These patterns, confirmed by replication, generated strong combined evidence for between-stratum effect size heterogeneity (FTO: PDIFF = 1.4 × 10−7; TCF7L2: PDIFF = 4.0 × 10−6). Other signals displaying evidence of effect size heterogeneity in the genome-wide analyses (on chromosomes 3, 12, 15, and 18) did not replicate. Analysis of the current list of type 2 diabetes susceptibility variants revealed nominal evidence for effect size heterogeneity for the SLC30A8 locus alone (RRobese 1.08 [1.01–1.15]; RRnonobese 1.18 [1.10–1.27]: PDIFF = 0.04).
CONCLUSIONS—This study demonstrates the impact of differences in case ascertainment on the power to detect and replicate genetic associations in genome-wide association studies. These data reinforce the notion that there is substantial etiological heterogeneity within type 2 diabetes.
PMCID: PMC2628627  PMID: 19056611
20.  Genome-Wide Association Study to Identify the Genetic Determinants of Otitis Media Susceptibility in Childhood 
PLoS ONE  2012;7(10):e48215.
Otitis media (OM) is a common childhood disease characterised by middle ear inflammation and effusion. Susceptibility to recurrent acute OM (rAOM; ≥3 episodes of AOM in 6 months) and chronic OM with effusion (COME; MEE ≥3 months) is 40–70% heritable. Few underlying genes have been identified to date, and no genome-wide association study (GWAS) of OM has been reported.
Methods and Findings
Data for 2,524,817 single nucleotide polymorphisms (SNPs; 535,544 quality-controlled SNPs genotyped by Illumina 660W-Quad; 1,989,273 by imputation) were analysed for association with OM in 416 cases and 1,075 controls from the Western Australian Pregnancy Cohort (Raine) Study. Logistic regression analyses under an additive model undertaken in GenABEL/ProbABEL adjusting for population substructure using principal components identified SNPs at CAPN14 (rs6755194: OR = 1.90; 95%CI 1.47–2.45; Padj-PCA = 8.3×10−7) on chromosome 2p23.1 as the top hit, with independent effects (rs1862981: OR = 1.60; 95%CI 1.29–1.99; Padj-PCA = 2.2×10−5) observed at the adjacent GALNT14 gene. In a gene-based analysis in VEGAS, BPIFA3 (PGene = 2×10−5) and BPIFA1 (PGene = 1.07×10−4) in the BPIFA gene cluster on chromosome 20q11.21 were the top hits. In all, 32 genomic regions show evidence of association (Padj-PCA<10−5) in this GWAS, with pathway analysis showing a connection between top candidates and the TGFβ pathway. However, top and tag-SNP analysis for seven selected candidate genes in this pathway did not replicate in 645 families (793 affected individuals) from the Western Australian Family Study of Otitis Media (WAFSOM). Lack of replication may be explained by sample size, difference in OM disease severity between primary and replication cohorts or due to type I error in the primary GWAS.
This first discovery GWAS for an OM phenotype has identified CAPN14 and GALNT14 on chromosome 2p23.1 and the BPIFA gene cluster on chromosome 20q11.21 as novel candidate genes which warrant further analysis in cohorts matched more precisely for clinical phenotypes.
PMCID: PMC3485007  PMID: 23133572
21.  Integration of disease-specific single nucleotide polymorphisms, expression quantitative trait loci and coexpression networks reveal novel candidate genes for type 2 diabetes 
Diabetologia  2012;55(8):2205-2213.
While genome-wide association studies (GWASs) have been successful in identifying novel variants associated with various diseases, it has been much more difficult to determine the biological mechanisms underlying these associations. Expression quantitative trait loci (eQTL) provide another dimension to these data by associating single nucleotide polymorphisms (SNPs) with gene expression. We hypothesised that integrating SNPs known to be associated with type 2 diabetes with eQTLs and coexpression networks would enable the discovery of novel candidate genes for type 2 diabetes.
We selected 32 SNPs associated with type 2 diabetes in two or more independent GWASs. We used previously described eQTLs mapped from genotype and gene expression data collected from 1,008 morbidly obese patients to find genes with expression associated with these SNPs. We linked these genes to coexpression modules, and ranked the other genes in these modules using an inverse sum score.
We found 62 genes with expression associated with type 2 diabetes SNPs. We validated our method by linking highly ranked genes in the coexpression modules back to SNPs through a combined eQTL dataset. We showed that the eQTLs highlighted by this method are significantly enriched for association with type 2 diabetes in data from the Wellcome Trust Case Control Consortium (WTCCC, p = 0.026) and the Gene Environment Association Studies (GENEVA, p = 0.042), validating our approach. Many of the highly ranked genes are also involved in the regulation or metabolism of insulin, glucose or lipids.
We have devised a novel method, involving the integration of datasets of different modalities, to discover novel candidate genes for type 2 diabetes.
PMCID: PMC3390705  PMID: 22584726
Genetics of type 2 diabetes; Genomics/proteomics; Mathematical modelling and simulation
22.  IPAD: the Integrated Pathway Analysis Database for Systematic Enrichment Analysis 
BMC Bioinformatics  2012;13(Suppl 15):S7.
Next-Generation Sequencing (NGS) technologies and Genome-Wide Association Studies (GWAS) generate millions of reads and hundreds of datasets, and there is an urgent need for a better way to accurately interpret and distill such large amounts of data. Extensive pathway and network analysis allow for the discovery of highly significant pathways from a set of disease vs. healthy samples in the NGS and GWAS. Knowledge of activation of these processes will lead to elucidation of the complex biological pathways affected by drug treatment, to patient stratification studies of new and existing drug treatments, and to understanding the underlying anti-cancer drug effects. There are approximately 141 biological human pathway resources as of Jan 2012 according to the Pathguide database. However, most currently available resources do not contain disease, drug or organ specificity information such as disease-pathway, drug-pathway, and organ-pathway associations. Systematically integrating pathway, disease, drug and organ specificity together becomes increasingly crucial for understanding the interrelationships between signaling, metabolic and regulatory pathway, drug action, disease susceptibility, and organ specificity from high-throughput omics data (genomics, transcriptomics, proteomics and metabolomics).
We designed the Integrated Pathway Analysis Database for Systematic Enrichment Analysis (IPAD), defining inter-association between pathway, disease, drug and organ specificity, based on six criteria: 1) comprehensive pathway coverage; 2) gene/protein to pathway/disease/drug/organ association; 3) inter-association between pathway, disease, drug, and organ; 4) multiple and quantitative measurement of enrichment and inter-association; 5) assessment of enrichment and inter-association analysis with the context of the existing biological knowledge and a "gold standard" constructed from reputable and reliable sources; and 6) cross-linking of multiple available data sources.
IPAD is a comprehensive database covering about 22,498 genes, 25,469 proteins, 1956 pathways, 6704 diseases, 5615 drugs, and 52 organs integrated from databases including BioCarta, KEGG, NCI-Nature curated, Reactome, CTD, PharmGKB, DrugBank, and HOMER. The database has a web-based user interface that allows users to perform enrichment analysis from genes/proteins/molecules and inter-association analysis from a pathway, disease, drug, and organ.
Moreover, the quality of the database was validated with the context of the existing biological knowledge and a "gold standard" constructed from reputable and reliable sources. Two case studies were also presented to demonstrate: 1) self-validation of enrichment analysis and inter-association analysis on brain-specific markers, and 2) identification of previously undiscovered components by the enrichment analysis from a prostate cancer study.
IPAD is a new resource for analyzing, identifying, and validating pathway, disease, drug, organ specificity and their inter-associations. The statistical method we developed for enrichment and similarity measurement and the two criteria we described for setting the threshold parameters can be extended to other enrichment applications. Enriched pathways, diseases, drugs, organs and their inter-associations can be searched, displayed, and downloaded from our online user interface. The current IPAD database can help users address a wide range of biological pathway related, disease susceptibility related, drug target related and organ specificity related questions in human disease studies.
PMCID: PMC3439721  PMID: 23046449
23.  HPD: an online integrated human pathway database enabling systems biology studies 
BMC Bioinformatics  2009;10(Suppl 11):S5.
Pathway-oriented experimental and computational studies have led to a significant accumulation of biological knowledge concerning three major types of biological pathway events: molecular signaling events, gene regulation events, and metabolic reaction events. A pathway consists of a series of molecular pathway events that link molecular entities such as proteins, genes, and metabolites. There are approximately 300 biological pathway resources as of April 2009 according to the Pathguide database; however, these pathway databases generally have poor coverage or poor quality, and are difficult to integrate, due to syntactic-level and semantic-level data incompatibilities.
We developed the Human Pathway Database (HPD) by integrating heterogeneous human pathway data that are either curated at the NCI Pathway Interaction Database (PID), Reactome, BioCarta, KEGG or indexed from the Protein Lounge Web sites. Integration of pathway data at syntactic, semantic, and schematic levels was based on a unified pathway data model and data warehousing-based integration techniques. HPD provides a comprehensive online view that connects human proteins, genes, RNA transcripts, enzymes, signaling events, metabolic reaction events, and gene regulatory events. At the time of this writing HPD includes 999 human pathways and more than 59,341 human molecular entities. The HPD software provides both a user-friendly Web interface for online use and a robust relational database backend for advanced pathway querying. This pathway tool enables users to 1) search for human pathways from different resources by simply entering genes/proteins involved in pathways or words appearing in pathway names, 2) analyze pathway-protein association, 3) study pathway-pathway similarity, and 4) build integrated pathway networks. We demonstrated the usage and characteristics of the new HPD through three breast cancer case studies.
HPD is a new resource for searching, managing, and studying human biological pathways. Users of HPD can search against large collections of human biological pathways, compare related pathways and their molecular entity compositions, and build high-quality, expanded-scope disease pathway models. The current HPD software can help users address a wide range of pathway-related questions in human disease biology studies.
PMCID: PMC3226194  PMID: 19811689
24.  Stratifying Type 2 Diabetes Cases by BMI Identifies Genetic Risk Variants in LAMA1 and Enrichment for Risk Variants in Lean Compared to Obese Cases 
Perry, John R. B. | Voight, Benjamin F. | Yengo, Loïc | Amin, Najaf | Dupuis, Josée | Ganser, Martha | Grallert, Harald | Navarro, Pau | Li, Man | Qi, Lu | Steinthorsdottir, Valgerdur | Scott, Robert A. | Almgren, Peter | Arking, Dan E. | Aulchenko, Yurii | Balkau, Beverley | Benediktsson, Rafn | Bergman, Richard N. | Boerwinkle, Eric | Bonnycastle, Lori | Burtt, Noël P. | Campbell, Harry | Charpentier, Guillaume | Collins, Francis S. | Gieger, Christian | Green, Todd | Hadjadj, Samy | Hattersley, Andrew T. | Herder, Christian | Hofman, Albert | Johnson, Andrew D. | Kottgen, Anna | Kraft, Peter | Labrune, Yann | Langenberg, Claudia | Manning, Alisa K. | Mohlke, Karen L. | Morris, Andrew P. | Oostra, Ben | Pankow, James | Petersen, Ann-Kristin | Pramstaller, Peter P. | Prokopenko, Inga | Rathmann, Wolfgang | Rayner, William | Roden, Michael | Rudan, Igor | Rybin, Denis | Scott, Laura J. | Sigurdsson, Gunnar | Sladek, Rob | Thorleifsson, Gudmar | Thorsteinsdottir, Unnur | Tuomilehto, Jaakko | Uitterlinden, Andre G. | Vivequin, Sidonie | Weedon, Michael N. | Wright, Alan F. | Hu, Frank B. | Illig, Thomas | Kao, Linda | Meigs, James B. | Wilson, James F. | Stefansson, Kari | van Duijn, Cornelia | Altschuler, David | Morris, Andrew D. | Boehnke, Michael | McCarthy, Mark I. | Froguel, Philippe | Palmer, Colin N. A. | Wareham, Nicholas J. | Groop, Leif | Frayling, Timothy M. | Cauchi, Stéphane
PLoS Genetics  2012;8(5):e1002741.
Common diseases such as type 2 diabetes are phenotypically heterogeneous. Obesity is a major risk factor for type 2 diabetes, but patients vary appreciably in body mass index. We hypothesized that the genetic predisposition to the disease may be different in lean (BMI<25 kg/m²) compared to obese cases (BMI≥30 kg/m²). We performed two case-control genome-wide studies using two accepted cut-offs for defining individuals as overweight or obese. We used 2,112 lean type 2 diabetes cases (BMI<25 kg/m²) or 4,123 obese cases (BMI≥30 kg/m²), and 54,412 un-stratified controls. Replication was performed in 2,881 lean cases or 8,702 obese cases, and 18,957 un-stratified controls. To assess the effects of known signals, we tested the individual and combined effects of SNPs representing 36 type 2 diabetes loci. After combining data from discovery and replication datasets, we identified two signals not previously reported in Europeans. A variant (rs8090011) in the LAMA1 gene was associated with type 2 diabetes in lean cases (P = 8.4×10⁻⁹, OR = 1.13 [95% CI 1.09–1.18]), and this association was stronger than that in obese cases (P = 0.04, OR = 1.03 [95% CI 1.00–1.06]). A variant in HMG20A, previously identified in South Asians but not Europeans, was associated with type 2 diabetes in obese cases (P = 1.3×10⁻⁸, OR = 1.11 [95% CI 1.07–1.15]), although this association was not significantly stronger than that in lean cases (P = 0.02, OR = 1.09 [95% CI 1.02–1.17]). For 36 known type 2 diabetes loci, 29 had a larger odds ratio in the lean compared to obese (binomial P = 0.0002). In the lean analysis, we observed a weighted per-risk allele OR = 1.13 [95% CI 1.10–1.17], P = 3.2×10⁻¹⁴. This was larger than the same model fitted in the obese analysis where the OR = 1.06 [95% CI 1.05–1.08], P = 2.2×10⁻¹⁶.
This study provides evidence that stratification of type 2 diabetes cases by BMI may help identify additional risk variants and that lean cases may have a stronger genetic predisposition to type 2 diabetes.
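The binomial comparison reported above (29 of 36 known loci with a larger odds ratio in lean cases) can be illustrated with a simple sign test. The sketch below is not the authors' code; it assumes a null hypothesis in which each locus is equally likely to show the larger odds ratio in either group (p = 0.5), and the function name `binomial_upper_tail` is hypothetical. The abstract does not say whether the reported P value is one- or two-sided, so both are computed.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Counts taken from the abstract: 29 of 36 loci had a larger OR in lean cases.
one_sided = binomial_upper_tail(29, 36)
# Doubling the tail is a common two-sided convention for a symmetric null (assumption).
two_sided = min(1.0, 2 * one_sided)

print(f"one-sided P = {one_sided:.2e}")
print(f"two-sided P = {two_sided:.2e}")
```

Under these assumptions the one-sided tail is roughly 1.6×10⁻⁴, which is of the same order as the reported binomial P = 0.0002.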
Author Summary
Individuals with Type 2 diabetes (T2D) can present with variable clinical characteristics. It is well known that obesity is a major risk factor for type 2 diabetes, yet patients can vary considerably: there are many lean diabetes patients and many overweight people without diabetes. We hypothesized that the genetic predisposition to the disease may be different in lean (BMI<25 kg/m²) compared to obese cases (BMI≥30 kg/m²). Specifically, as lean T2D patients had lower risk than obese patients, they must have been more genetically susceptible. Using genetic data from multiple genome-wide association studies, we tested genetic markers across the genome in 2,112 lean type 2 diabetes cases (BMI<25 kg/m²), 4,123 obese cases (BMI≥30 kg/m²), and 54,412 healthy controls. We confirmed our results in an additional 2,881 lean cases, 8,702 obese cases, and 18,957 healthy controls. Using these data we found differences in genetic enrichment between lean and obese cases, supporting our original hypothesis. We also searched for genetic variants that may be risk factors only in lean or obese patients and found two novel gene regions not previously reported in European individuals. These findings may influence future study design for type 2 diabetes and provide further insight into the biology of the disease.
PMCID: PMC3364960  PMID: 22693455
25.  Genome-wide meta-analysis of genetic susceptible genes for Type 2 Diabetes 
BMC Systems Biology  2012;6(Suppl 3):S16.
Many genetic studies, including single-gene studies and genome-wide association studies (GWAS), aim to identify risk alleles for genetic diseases such as Type II Diabetes (T2D). However, in T2D studies, a significant share of the hereditary risk cannot be explained by individual risk genes alone. Systems biology approaches are needed to integrate comprehensive genetic information and provide new insight into T2D biology.
We performed a comprehensive integrative analysis of single nucleotide polymorphisms (SNPs) individually curated from T2D GWAS results and mapped them to T2D candidate risk genes. Using protein-protein interaction data, we constructed a T2D-specific molecular interaction network consisting of T2D genetic risk genes and their interacting gene partners. We then studied the relationship between these T2D genes and curated gene sets.
We determined that T2D candidate risk genes are concentrated in certain parts of the genome, specifically on chromosome 20. Using the T2D genetic network, we identified highly interconnected network "hub" genes. By incorporating T2D GWAS results, T2D pathways, and T2D genes' functional category information, we further ranked T2D risk genes, T2D-related pathways, and T2D-related functional categories. We found the highly interconnected T2D disease network "hub" genes most strongly associated with T2D genetic risk to be PI3KR1, ESR1, and ENPP1. The well-characterized TCF7L2, contrary to our expectation, was not among the highest-ranked T2D genes. Many interacting pathways play a role in T2D genetic risk, including the insulin signalling pathway, the type II diabetes pathway, maturity onset diabetes of the young, the adipocytokine signalling pathway, and pathways in cancer. We also observed significant crosstalk among T2D gene subnetworks, which include insulin secretion, regulation of insulin secretion, response to peptide hormone stimulus, response to insulin stimulus, peptide secretion, glucose homeostasis, and hormone transport. Overview maps involving T2D genes, gene sets, pathways, and their interactions are all reported.
Large-scale systems biology meta-analyses of GWAS results can improve the interpretation of genetic variation and genetic risk factors. T2D genetic risk can be attributed to the summed genetic effects of many genes involved in a broad range of signalling pathways and functional networks. The framework developed for T2D studies may serve as a guide for studying other complex diseases.
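As a rough illustration of how network "hub" genes might be picked out of a protein-protein interaction network, the sketch below ranks nodes by degree (number of distinct interaction partners). This is an assumption made for illustration: the abstract does not state which connectivity measure or ranking scheme the authors used. The gene labels and edges are placeholders, not data from the study, and `rank_hubs` is a hypothetical helper.

```python
from collections import Counter

# Placeholder undirected interactions; NOT real interaction data.
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
]


def rank_hubs(edges, top_n=3):
    """Rank nodes by degree, ignoring self-loops and duplicate edges."""
    degree = Counter()
    seen = set()
    for u, v in edges:
        if u == v:
            continue  # a self-interaction adds no partner
        key = frozenset((u, v))
        if key in seen:
            continue  # count each undirected pair only once
        seen.add(key)
        degree[u] += 1
        degree[v] += 1
    # Sort by descending degree, then by name for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


for gene, deg in rank_hubs(edges):
    print(gene, deg)
```

Degree is only the simplest notion of a hub; a fuller analysis could weight each gene by its GWAS evidence or pathway membership, as the abstract describes.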
PMCID: PMC3524015  PMID: 23281828
