Several studies have investigated associations of the -174G>C polymorphism (rs1800795) of the IL6 gene with diabetes-related phenotypes, but have presented inconsistent results.
This joint analysis aimed to clarify whether IL6 -174G>C was associated with type 2 diabetes mellitus (T2DM) related quantitative phenotypes.
Individual-level data from all studies of the IL6-T2DM consortium on Caucasian subjects with available BMI were collected. As study-specific estimates did not show heterogeneity (P>0.1), they were combined by using the inverse-variance fixed-effect model.
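The inverse-variance fixed-effect pooling named above can be sketched as follows. This is a minimal illustration, not the consortium's analysis code: the function name `fixed_effect_meta` and the per-study numbers are hypothetical, and Cochran's Q is included only as one common way to quantify heterogeneity (the source does not say which test was used).

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Pool per-study effect estimates using inverse-variance weights.

    Returns (pooled estimate, pooled standard error, Cochran's Q).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return pooled, pooled_se, q


# Hypothetical per-study effects on fasting glucose (mmol/L); not from the paper.
study_betas = [-0.10, -0.08, -0.09]
study_ses = [0.05, 0.04, 0.06]
pooled_beta, pooled_se, cochran_q = fixed_effect_meta(study_betas, study_ses)
```

Studies with smaller standard errors dominate the pooled estimate, which is why this model is appropriate only when the study-specific effects are not heterogeneous.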
The main analysis included 9440, 7398, 24,117, or 5659 nondiabetic and manifest T2DM subjects for fasting glucose, 2-hour glucose, BMI or circulating interleukin-6 levels, respectively. IL6 -174 C-allele carriers had significantly lower fasting glucose (−0.091mmol/L, P=0.014). There was no evidence for association between IL6 -174G>C and BMI or interleukin-6. In an additional analysis of 641 subjects known to develop T2DM later on, the IL6 -174 CC-genotype was associated with higher baseline interleukin-6 (+0.75pg/mL, P=0.004), which was consistent with higher interleukin-6 in the 966 manifest T2DM subjects (+0.50pg/mL, P=0.044).
Our data suggest an association between IL6 -174G>C and quantitative glucose, and exploratory analysis indicated modulated interleukin-6 levels in pre-diabetic subjects, in line with this SNP’s previously reported T2DM association and a role of circulating interleukin-6 as an intermediate phenotype.
blood glucose; body mass index; diabetes mellitus; type 2; epidemiology; molecular; genes; inflammation mediators; interleukin-6; intermediate phenotype; meta-analysis; polymorphism; single nucleotide
Serum metabolite concentrations provide a direct readout of biological processes in the human body, and are associated with disorders such as cardiovascular and metabolic diseases. Here we present a genome-wide association study with 163 metabolic traits using 1809 participants from the KORA population, followed up in the TwinsUK cohort with 422 participants. In eight out of nine replicated loci (FADS1, ELOVL2, ACADS, ACADM, ACADL, SPTLC3, ETFDH, SLC16A9) the genetic variant is located in or near enzyme or solute carrier coding genes, where the associating metabolic traits match the proteins’ function. Many of these loci are located in rate-limiting steps of important enzymatic reactions. Use of metabolite concentration ratios as proxies for enzymatic reaction rates reduces the variance and yields robust statistical associations with p-values between 3×10^−24 and 6.5×10^−179. These loci explained 5.6% to 36.3% of the observed variance. For several loci, associations with clinically relevant parameters have previously been reported.
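To make the idea of a metabolite ratio as a trait concrete, the sketch below regresses the log of a product/substrate concentration ratio on genotype dosage (0, 1 or 2 copies of an allele). It is an illustration under stated assumptions: the helper names, the additive single-SNP model without covariates, and all data values are hypothetical and are not taken from the study.

```python
import math


def log_ratio(conc_a, conc_b):
    """Log of the concentration ratio a/b for each individual."""
    return [math.log(a / b) for a, b in zip(conc_a, conc_b)]


def simple_regression(x, y):
    """Least-squares slope of y on x and the fraction of variance explained."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


# Hypothetical data: allele dosage and two metabolite concentrations.
dosage = [0, 0, 1, 1, 2, 2]
product = [4.1, 3.9, 3.0, 3.2, 2.1, 2.0]
substrate = [2.0, 2.1, 2.9, 3.0, 4.2, 3.9]

trait = log_ratio(product, substrate)
slope, r_squared = simple_regression(dosage, trait)
```

When a variant changes the turnover of one enzymatic step, product and substrate tend to move in opposite directions, so their ratio can carry a clearer signal than either concentration alone.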
A genome-wide association study of educational attainment was conducted in a discovery sample of 101,069 individuals and a replication sample of 25,490. Three independent SNPs are genome-wide significant (rs9320913, rs11584700, rs4851266), and all three replicate. Estimated effect sizes are small (R^2 ≈ 0.02%), approximately 1 month of schooling per allele. A linear polygenic score from all measured SNPs accounts for ≈ 2% of the variance in both educational attainment and cognitive function. Genes in the region of the loci have previously been associated with health, cognitive, and central nervous system phenotypes, and bioinformatics analyses suggest the involvement of the anterior caudate nucleus. These findings provide promising candidate SNPs for follow-up work, and our effect size estimates can anchor power analyses in social-science genetics.
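A linear polygenic score is a weighted sum of allele counts, with one weight per SNP taken from association results. The sketch below shows that computation only. The SNP identifiers are the three named above, but the weights, the genotype, and the function name `polygenic_score` are hypothetical, and the study's actual score used all measured SNPs.

```python
def polygenic_score(allele_counts, weights):
    """Weighted sum of effect-allele counts over the SNPs that have a weight.

    allele_counts: dict mapping SNP id -> count of the effect allele (0, 1 or 2)
    weights:       dict mapping SNP id -> per-allele effect estimate
    """
    missing = [snp for snp in weights if snp not in allele_counts]
    if missing:
        raise KeyError(f"no genotype for: {missing}")
    return sum(weights[snp] * allele_counts[snp] for snp in weights)


# Hypothetical per-allele weights; real weights would come from GWAS estimates.
example_weights = {"rs9320913": 0.1, "rs11584700": 0.2, "rs4851266": 0.3}
# Hypothetical genotype of one individual.
example_genotype = {"rs9320913": 2, "rs11584700": 1, "rs4851266": 0}

example_score = polygenic_score(example_genotype, example_weights)
```

The variance explained by such a score is then the R^2 from regressing the phenotype on the score in a sample that was not used to estimate the weights.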
A link between severe mental stress and shorter telomere length (TL) has been suggested. We analysed the impact of Posttraumatic Stress Disorder (PTSD) on TL in the general population and postulated a dose-dependent TL association in subjects suffering from partial PTSD compared to full PTSD.
Data are derived from the population-based KORA F4 study (2006–2008), located in southern Germany including 3,000 individuals (1,449 men and 1,551 women) with valid and complete TL data. Leukocyte TL was measured using a quantitative PCR-based technique. PTSD was assessed in a structured interview and by applying the Posttraumatic Diagnostic Scale (PDS) and the Impact of Event Scale (IES). A total of 262 (8.7%) subjects qualified for having partial PTSD and 51 (1.7%) for full PTSD. To assess the association of PTSD with the average TL, linear regression analyses with adjustments for potential confounding factors were performed.
The multiple model revealed a significant association between partial PTSD and TL (beta = −0.051, p = 0.009) as well as between full PTSD and TL (beta = −0.103, p = 0.014), indicating shorter TL on average for partial and full PTSD. An additional adjustment for depression and depressed mood/exhaustion gave comparable beta estimates.
Participants with partial and full PTSD had significantly shorter leukocyte TL than participants without PTSD. The dose-dependent variation in TL of subjects with partial and full PTSD exceeded the chronological age effect, and was equivalent to an estimated 5 years in partial and 10 years in full PTSD of premature aging.
Background Human ageing is a complex, multifactorial process and early developmental factors affect health outcomes in old age.
Methods Metabolomic profiling on fasting blood was carried out in 6055 individuals from the UK. Stepwise regression was performed to identify a panel of independent metabolites which could be used as a surrogate for age. We also investigated the association with birthweight overall and within identical discordant twins and with genome-wide methylation levels.
Results We identified a panel of 22 metabolites which combined are strongly correlated with age (R^2 = 59%) and with age-related clinical traits independently of age. One particular metabolite, C-glycosyl tryptophan (C-glyTrp), correlated strongly with age (beta = 0.03, SE = 0.001, P = 7.0 × 10^−157) and lung function (FEV1 beta = −0.04, SE = 0.008, P = 1.8 × 10^−8 adjusted for age and confounders) and was replicated in an independent population (n = 887). C-glyTrp was also associated with bone mineral density (beta = −0.01, SE = 0.002, P = 1.9 × 10^−6) and birthweight (beta = −0.06, SE = 0.01, P = 2.5 × 10^−9). The difference in C-glyTrp levels explained 9.4% of the variance in the difference in birthweight between monozygotic twins. An epigenome-wide association study in 172 individuals identified three CpG sites associated with levels of C-glyTrp (P < 2 × 10^−6). We replicated one CpG site in the promoter of the WDR85 gene in an independent sample of 350 individuals (beta = −0.20, SE = 0.04, P = 2.9 × 10^−8). WDR85 is a regulator of translation elongation factor 2, essential for protein synthesis in eukaryotes.
Conclusions Our data illustrate how metabolomic profiling linked with epigenetic studies can identify some key molecular mechanisms potentially determined in early development that produce long-term physiological changes influencing human health and ageing.
Ageing; metabolomics; epigenetics; twin studies; developmental origins of health and disease; birthweight
Environmental factors such as tobacco smoking may have long-lasting effects on DNA methylation patterns, which might lead to changes in gene expression and in a broader context to the development or progression of various diseases. We conducted an epigenome-wide association study (EWAS) comparing current, former and never smokers from 1793 participants of the population-based KORA F4 panel, with replication in 479 participants from the KORA F3 panel, carried out using the 450K BeadChip with genomic DNA obtained from whole blood. We observed widespread differences in the degree of site-specific methylation (with p-values ranging from 9.31E-08 to 2.54E-182) as a function of tobacco smoking in each of the 22 autosomes, with the percent of variance explained by smoking ranging from 1.31 to 41.02. Depending on cessation time and pack-years, methylation levels in former smokers were found to be close to the ones seen in never smokers. In addition, methylation-specific protein binding patterns were observed for cg05575921 within AHRR, which had the highest level of detectable changes in DNA methylation associated with tobacco smoking (−24.40% methylation; p = 2.54E-182), suggesting a regulatory role for gene expression. The results of our study confirm the broad effect of tobacco smoking on the human organism, but also show that quitting tobacco smoking presumably allows regaining the DNA methylation state of never smokers.
Economic variables such as income, education, and occupation are known to affect mortality and morbidity, such as cardiovascular disease, and have also been shown to be partly heritable. However, very little is known about which genes influence economic variables, although these genes may have both a direct and an indirect effect on health. We report results from the first large-scale collaboration that studies the molecular genetic architecture of an economic variable, entrepreneurship, that was operationalized using self-employment, a widely available proxy. Our results suggest that common SNPs when considered jointly explain about half of the narrow-sense heritability of self-employment estimated in twin data (σ_g^2/σ_P^2 = 25%, h^2 = 55%). However, a meta-analysis of genome-wide association studies across sixteen studies comprising 50,627 participants did not identify genome-wide significant SNPs. 58 SNPs with p<10^−5 were tested in a replication sample (n = 3,271), but none replicated. Furthermore, a gene-based test shows that none of the genes that were previously suggested in the literature to influence entrepreneurship reveal significant associations. Finally, SNP-based genetic scores that use results from the meta-analysis capture less than 0.2% of the variance in self-employment in an independent sample (p≥0.039). Our results are consistent with a highly polygenic molecular genetic architecture of self-employment, with many genetic variants of small effect. Although self-employment is a multi-faceted, heavily environmentally influenced, and biologically distal trait, our results are similar to those for other genetically complex and biologically more proximate outcomes, such as height, intelligence, personality, and several diseases.
Through genome-wide association meta-analyses of up to 133,010 individuals of European ancestry without diabetes, including individuals newly genotyped using the Metabochip, we have raised the number of confirmed loci influencing glycemic traits to 53, of which 33 also increase type 2 diabetes risk (q < 0.05). Loci influencing fasting insulin showed association with lipid levels and fat distribution, suggesting impact on insulin resistance. Gene-based analyses identified further biologically plausible loci, suggesting that additional loci beyond those reaching genome-wide significance are likely to represent real associations. This conclusion is supported by an excess of directionally consistent and nominally significant signals between discovery and follow-up studies. Functional follow-up of these newly discovered loci will further improve our understanding of glycemic control.
Atopy and plasma IgE concentration are genetically complex traits, and the specific genetic risk factors that lead to IgE dysregulation and clinical atopy are an area of active investigation.
To ascertain the genetic risk factors which lead to IgE dysregulation.
A genome wide association study (GWAS) was performed in 6,819 participants from the Framingham Heart Study (FHS). Seventy of the top SNPs were selected based on p-values and linkage disequilibrium among neighboring SNPs and evaluated in a meta-analysis with five independent populations from the KORA, B58C, and CAMP cohorts.
Thirteen SNPs located in the region of three genes, FCER1A, STAT6, and IL-13, were found to have genome-wide significance in the FHS GWAS. The most significant SNPs from the three regions were rs2251746 (FCER1A, p-value 2.11×10^−12), rs1059513 (STAT6, p-value 2.87×10^−8), and rs1295686 (IL-13, p-value 3.55×10^−8). Four additional gene regions (HLA-G, HLA-DQA2, HLA-A, and DARC) reached genome-wide statistical significance in meta-analysis combining FHS and replication cohorts, although the DARC association did not appear independent of SNPs in the nearby FCER1A gene.
This GWAS of the FHS has identified genetic loci in HLA genes that may have a role in the pathogenesis of IgE dysregulation and atopy. It also confirmed the association of known susceptibility loci, FCER1A, STAT6, and IL-13, for the dysregulation of total IgE.
total IgE; atopy; asthma; GWAS
Nuclear magnetic resonance spectroscopy (NMR) provides robust readouts of many metabolic parameters in one experiment. However, identification of clinically relevant markers in 1H NMR spectra is a major challenge. Association of NMR-derived quantities with genetic variants can uncover biologically relevant metabolic traits. Using NMR data of plasma samples from 1,757 individuals from the KORA study together with 655,658 genetic variants, we show that ratios between NMR intensities at two chemical shift positions can provide informative and robust biomarkers. We report seven loci of genetic association with NMR-derived traits (APOA1, CETP, CPS1, GCKR, FADS1, LIPC, PYROXD2) and characterize these traits biochemically using mass spectrometry. These ratios may now be used in clinical studies.
Recent advances in the identification of susceptibility genes and environmental exposures provide broad support for a post-infectious autoimmune basis for narcolepsy/hypocretin (orexin) deficiency. We genotyped loci associated with other autoimmune and inflammatory diseases in 1,886 individuals with hypocretin-deficient narcolepsy and 10,421 controls, all of European ancestry, using a custom genotyping array (ImmunoChip). Three loci located outside the Human Leukocyte Antigen (HLA) region on chromosome 6 were significantly associated with disease risk. In addition to a strong signal in the T cell receptor alpha (TRA@), variants in two additional narcolepsy loci, Cathepsin H (CTSH) and Tumor necrosis factor (ligand) superfamily member 4 (TNFSF4, also called OX40L), attained genome-wide significance. These findings underline the importance of antigen presentation by HLA Class II to T cells in the pathophysiology of this autoimmune disease.
While there is now broad consensus that narcolepsy-hypocretin deficiency results from a highly specific autoimmune attack on hypocretin cells, little is understood regarding the initiation and progression of the underlying autoimmune process. We have taken advantage of a unique high-density genotyping platform (the ImmunoChip) designed to study variants in genes known to be important to autoimmune and inflammatory diseases. Our study of nearly 2000 narcolepsy cases compared to 10,000 controls underscored important roles for HLA DQB1*06:02 and the T cell receptor alpha genes and implicated two additional genes, Cathepsin H and TNFSF4/OX40L, in disease pathogenesis. These findings are particularly important, as these encoded proteins have key roles in antigen processing, presentation, and T cell response, and they suggest that specific interactions at the immunological synapse constitute the pathway to the disease. Further studies of these genes and encoded proteins may therefore reveal the mechanism leading to this highly selective and unique autoimmune disease.
There are hints of an altered mitochondrial function in obesity. Nuclear-encoded genes are relevant for mitochondrial function (3 gene sets of known relevant pathways: (1) 16 nuclear regulators of mitochondrial genes, (2) 91 genes for oxidative phosphorylation and (3) 966 nuclear-encoded mitochondrial genes). Gene set enrichment analysis (GSEA) showed no association with type 2 diabetes mellitus in these gene sets. Here we performed a GSEA for the same gene sets for obesity. Genome-wide association study (GWAS) data from a case-control approach on 453 extremely obese children and adolescents and 435 lean adult controls were used for GSEA. For independent confirmation, we analyzed 705 obesity GWAS trios (extremely obese child and both biological parents) and a population-based GWAS sample (KORA F4, n = 1,743). A meta-analysis was performed on all three samples. In each sample, the distribution of significance levels between the respective gene set and those of all genes was compared using the leading-edge-fraction-comparison test (cut-offs between the 50th and 95th percentile of the set of all gene-wise corrected p-values) as implemented in the MAGENTA software. In the case-control sample, significant enrichment of associations with obesity was observed above the 50th percentile for the set of the 16 nuclear regulators of mitochondrial genes (p_GSEA,50 = 0.0103). This finding was not confirmed in the trios (p_GSEA,50 = 0.5991), but was confirmed in KORA (p_GSEA,50 = 0.0398). The meta-analysis again indicated a trend for enrichment (p_MAGENTA,50 = 0.1052, p_MAGENTA,75 = 0.0251). The GSEA revealed that weak association signals for obesity might be enriched in the gene set of 16 nuclear regulators of mitochondrial genes.
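The logic of a leading-edge-fraction comparison can be sketched as a permutation test: count how many genes of the set have a p-value more significant than a percentile cut-off of all genes, then ask how often random gene sets of the same size do at least as well. This is a simplified illustration and an assumption about the general idea only. It does not reproduce MAGENTA's gene-score correction or its exact procedure, and all names and data below are hypothetical.

```python
import random


def percentile_cutoff(all_pvalues, percentile):
    """P-value below which a gene ranks above the given percentile of significance."""
    ranked = sorted(all_pvalues, reverse=True)  # least significant first
    index = min(len(ranked) - 1, int(len(ranked) * percentile / 100))
    return ranked[index]


def leading_edge_fraction(set_pvalues, cutoff):
    """Fraction of genes in the set that are at least as significant as the cut-off."""
    return sum(p <= cutoff for p in set_pvalues) / len(set_pvalues)


def enrichment_pvalue(gene_pvalues, gene_set, percentile=50, n_perm=2000, seed=0):
    """Compare the set's leading-edge fraction with random sets of equal size."""
    genes = list(gene_pvalues)
    cutoff = percentile_cutoff(list(gene_pvalues.values()), percentile)
    observed = leading_edge_fraction([gene_pvalues[g] for g in gene_set], cutoff)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, len(gene_set))
        if leading_edge_fraction([gene_pvalues[g] for g in sample], cutoff) >= observed:
            hits += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical gene-wise p-values for 100 genes; the set holds the 10 most significant.
demo_pvalues = {f"gene{i}": (i + 1) / 100 for i in range(100)}
demo_set = [f"gene{i}" for i in range(10)]
demo_fraction, demo_p = enrichment_pvalue(demo_pvalues, demo_set)
```

Using a percentile cut-off rather than a genome-wide significance threshold is what lets this kind of test pick up many weak signals concentrated in one gene set.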
Lipoprotein-associated phospholipase A2 (Lp-PLA2) generates proinflammatory and proatherogenic compounds in the arterial vascular wall and is a potential therapeutic target in coronary heart disease (CHD). We searched for genetic loci related to Lp-PLA2 mass or activity by a genome-wide association study as part of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.
Methods and results
In meta-analyses of findings from five population-based studies, comprising 13 664 subjects, variants at two loci (PLA2G7, CETP) were associated with Lp-PLA2 mass. The strongest signal was at rs1805017 in PLA2G7 [P = 2.4 × 10^−23, log Lp-PLA2 difference per allele (beta): 0.043]. Variants at six loci were associated with Lp-PLA2 activity (PLA2G7, APOC1, CELSR2, LDL, ZNF259, SCARB1), among which the strongest signals were at rs4420638, near the APOE–APOC1–APOC4–APOC2 cluster [P = 4.9 × 10^−30; log Lp-PLA2 difference per allele (beta): −0.054]. There were no significant gene–environment interactions between these eight polymorphisms associated with Lp-PLA2 mass or activity and age, sex, body mass index, or smoking status. Four of the polymorphisms (in APOC1, CELSR2, SCARB1, ZNF259), but not PLA2G7, were significantly associated with CHD in a second study.
Levels of Lp-PLA2 mass and activity were associated with PLA2G7, the gene coding for this protein. Lipoprotein-associated phospholipase A2 activity was also strongly associated with genetic variants related to low-density lipoprotein cholesterol levels.
Genome-wide association; Inflammation; Lipoprotein-associated phospholipase A2
Microarray profiling of gene expression is widely applied in molecular biology and functional genomics. Experimental and technical variations make meta-analysis of different studies challenging. In a total of 3358 samples, all from German population-based cohorts, we investigated the effect of data preprocessing and the variability due to sample processing in whole blood cell and blood monocyte gene expression data, measured on the Illumina HumanHT-12 v3 BeadChip array.
Gene expression signal intensities were similar after applying the log2 or the variance-stabilizing transformation. In all cohorts, the first principal component (PC) explained more than 95% of the total variation. Technical factors substantially influenced signal intensity values, especially the Illumina chip assignment (33–48% of the variance), the RNA amplification batch (12–24%), the RNA isolation batch (16%), and the sample storage time, in particular the time between blood donation and RNA isolation for the whole blood cell samples (2–3%), and the time between RNA isolation and amplification for the monocyte samples (2%). White blood cell composition parameters were the strongest biological factors influencing the expression signal intensities in the whole blood cell samples (3%), followed by sex (1–2%) in both sample types. Known single nucleotide polymorphisms (SNPs) were located in 38% of the analyzed probe sequences and 4% of them included common SNPs (minor allele frequency >5%). Out of the tested SNPs, 1.4% significantly modified the probe-specific expression signals (Bonferroni corrected p-value<0.05), but in almost half of these events the signal intensities were even increased despite the occurrence of the mismatch. Thus, the vast majority of SNPs within probes had no significant effect on hybridization efficiency.
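One simple way to express how much of the variation in a signal is attributable to a categorical technical factor, such as chip assignment or an isolation batch, is the ratio of between-group to total sum of squares. The sketch below shows that calculation for a single factor. It is an assumption for illustration, not the estimation method used in the study, and the function name and data are hypothetical.

```python
def variance_explained(values, groups):
    """Between-group sum of squares divided by total sum of squares (eta squared)."""
    if len(values) != len(groups) or not values:
        raise ValueError("need one group label per value")
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("values do not vary")
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    # Each group contributes its size times the squared shift of its mean.
    ss_between = sum(
        len(members) * (sum(members) / len(members) - grand_mean) ** 2
        for members in by_group.values()
    )
    return ss_between / ss_total


# Hypothetical log2 intensities of one probe measured on three chips.
intensities = [7.9, 8.1, 8.0, 9.0, 9.2, 9.1, 8.4, 8.6, 8.5]
chip_ids = ["chip1"] * 3 + ["chip2"] * 3 + ["chip3"] * 3
chip_share = variance_explained(intensities, chip_ids)
```

A factor with a large share is a natural candidate for adjustment before expression data from different batches or studies are combined.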
In summary, adjustment for a few selected technical factors greatly improved reliability of gene expression analyses. Such adjustments are particularly required for meta-analyses.
Concentrations of liver enzymes in plasma are widely used as indicators of liver disease. We carried out a genome-wide association study in 61,089 individuals, identifying 42 loci associated with concentrations of liver enzymes in plasma, of which 32 are new associations (P = 10^−8 to P = 10^−190). We used functional genomic approaches including metabonomic profiling and gene expression analyses to identify probable candidate genes at these regions. We identified 69 candidate genes, including genes involved in biliary transport (ATP8B1 and ABCB11), glucose, carbohydrate and lipid metabolism (FADS1, FADS2, GCKR, JMJD1C, HNF1A, MLXIPL, PNPLA3, PPP1R3B, SLC2A2 and TRIB1), glycoprotein biosynthesis and cell surface glycobiology (ABO, ASGR1, FUT2, GPLD1 and ST3GAL4), inflammation and immunity (CD276, CDH6, GCKR, HNF1A, HPR, ITGA1, RORA and STAT4) and glutathione metabolism (GSTT1, GSTT2 and GGT), as well as several genes of uncertain or unknown function (including ABHD12, EFHD1, EFNA1, EPHA2, MICAL3 and ZNF827). Our results provide new insight into genetic mechanisms and pathways influencing markers of liver function.
Recent genome-wide association studies (GWAS) with metabolomics data linked genetic variation in the human genome to differences in individual metabolite levels. A strong relevance of this metabolic individuality for biomedical and pharmaceutical research has been reported. However, a considerable amount of the molecules currently quantified by modern metabolomics techniques are chemically unidentified. The identification of these “unknown metabolites” is still a demanding and intricate task, limiting their usability as functional markers of metabolic processes. As a consequence, previous GWAS largely ignored unknown metabolites as metabolic traits for the analysis. Here we present a systems-level approach that combines genome-wide association analysis and Gaussian graphical modeling with metabolomics to predict the identity of the unknown metabolites. We apply our method to original data of 517 metabolic traits, of which 225 are unknowns, and genotyping information on 655,658 genetic variants, measured in 1,768 human blood samples. We report previously undescribed genotype–metabotype associations for six distinct gene loci (SLC22A2, COMT, CYP3A5, CYP2C18, GBA3, UGT3A1) and one locus not related to any known gene (rs12413935). Overlaying the inferred genetic associations, metabolic networks, and knowledge-based pathway information, we derive testable hypotheses on the biochemical identities of 106 unknown metabolites. As a proof of principle, we experimentally confirm nine concrete predictions. We demonstrate the benefit of our method for the functional interpretation of previous metabolomics biomarker studies on liver detoxification, hypertension, and insulin resistance. Our approach is generic in nature and can be directly transferred to metabolomics data from different experimental platforms.
Genome-wide association studies on metabolomics data have demonstrated that genetic variation in metabolic enzymes and transporters leads to concentration changes in the respective metabolite levels. The conventional goal of these studies is the detection of novel interactions between the genome and the metabolic system, providing valuable insights for both basic research and clinical applications. In this study, we borrow the metabolomics GWAS concept for a novel, entirely different purpose. Metabolite measurements frequently produce signals where a certain substance can be reliably detected in the sample, but it has not yet been elucidated which specific metabolite this signal actually represents. The concept is comparable to a fingerprint: each one is uniquely identifiable, but as long as it is not registered in a database one cannot tell to whom this fingerprint belongs. Obviously, this issue tremendously reduces the usability of metabolomics analyses. The genetic associations of such an “unknown,” however, give us concrete evidence of the metabolic pathway this substance is most probably involved in. Moreover, we complement the approach with a specific measure of correlation between metabolites, providing further evidence of the metabolic processes of the unknown. For a number of cases, this even allows for a concrete identity prediction, which we then experimentally validate in the lab.
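Gaussian graphical models are built on partial correlations, that is, the correlation between two variables after the linear effect of the others has been removed. The sketch below shows the three-variable case, where two metabolites are conditioned on a single third one. A full model conditions on all remaining metabolites at once, which this sketch does not do, and the function names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical levels: x and y each follow a shared precursor z plus independent variation.
z_levels = [-1.0, -1.0, 1.0, 1.0]
x_levels = [-2.0, 0.0, 0.0, 2.0]
y_levels = [-2.0, 0.0, 2.0, 0.0]

marginal = pearson(x_levels, y_levels)
conditional = partial_correlation(x_levels, y_levels, z_levels)
```

In this toy example the two metabolites are correlated only because both depend on the third, so the partial correlation vanishes. Edges that survive conditioning are the ones more likely to reflect a direct biochemical link.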
Psoriatic arthritis (PsA) is a chronic inflammatory musculoskeletal disease affecting up to 30% of psoriasis vulgaris (PsV) cases and approximately 0.25% to 1% of the general population. To identify common susceptibility loci, we performed a meta-analysis of three imputed genome-wide association studies (GWAS) on psoriasis, stratified for PsA. A total of 1,160,703 SNPs were analyzed in the discovery set consisting of 535 PsA cases and 3,432 controls from Germany, the United States and Canada. We followed up two SNPs in 1,931 PsA cases and 6,785 controls comprising six independent replication panels from Germany, Estonia, the United States and Canada. In the combined analysis, a genome-wide significant association was detected at 2p16 near the REL locus encoding c-Rel (rs13017599, P=1.18×10^−8, OR=1.27, 95% CI=1.18–1.35). The rs13017599 polymorphism is known to associate with rheumatoid arthritis (RA), and another SNP near REL (rs702873) was recently implicated in PsV susceptibility. However, conditional analysis indicated that rs13017599, rather than rs702873, accounts for the PsA association at REL. We hypothesize that c-Rel, as a member of the Rel/NF-κB family, is associated with PsA in the context of disease pathways that involve other identified PsA and PsV susceptibility genes including TNIP1, TNFAIP3 and NFκBIA.
A targeted metabolomics approach was used to identify candidate biomarkers of pre-diabetes. The relevance of the identified metabolites is further corroborated with a protein-metabolite interaction network and gene expression data.
Three metabolites (glycine, lysophosphatidylcholine (LPC) (18:2) and acetylcarnitine C2) were found with significantly altered levels in pre-diabetic individuals compared with normal controls. Lower levels of glycine and LPC (18:2) were found to predict risks for pre-diabetes and type 2 diabetes (T2D). Seven T2D-related genes (PPARG, TCF7L2, HNF1A, GCK, IGF1, IRS1 and IDE) are functionally associated with the three identified metabolites. The unique combination of methodologies, including prospective population-based and nested case–control, as well as cross-sectional studies, was essential for the identification of the reported biomarkers.
Type 2 diabetes (T2D) can be prevented in pre-diabetic individuals with impaired glucose tolerance (IGT). Here, we have used a metabolomics approach to identify candidate biomarkers of pre-diabetes. We quantified 140 metabolites for 4297 fasting serum samples in the population-based Cooperative Health Research in the Region of Augsburg (KORA) cohort. Our study revealed significant metabolic variation in pre-diabetic individuals that is distinct from known diabetes risk indicators, such as glycosylated hemoglobin levels, fasting glucose and insulin. We identified three metabolites (glycine, lysophosphatidylcholine (LPC) (18:2) and acetylcarnitine) that had significantly altered levels in IGT individuals as compared to those with normal glucose tolerance, with P-values ranging from 2.4 × 10^−4 to 2.1 × 10^−13. Lower levels of glycine and LPC were found to be predictors not only for IGT but also for T2D, and were independently confirmed in the European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam cohort. Using metabolite–protein network analysis, we identified seven T2D-related genes that are associated with these three IGT-specific metabolites by multiple interactions with four enzymes. The expression levels of these enzymes correlate with changes in the metabolite concentrations linked to diabetes. Our results may help to develop novel strategies to prevent T2D.
early diagnostic biomarkers; IGT; metabolomics; prediction; T2D
Background & Aims
A limited number of genetic risk factors have been reported in primary sclerosing cholangitis (PSC). To discover further genetic susceptibility factors for PSC, we followed up on a second tier of single nucleotide polymorphisms (SNPs) from a genome-wide association study (GWAS).
We analyzed 45 SNPs in 1221 PSC cases and 3508 controls. The association results from the replication analysis and the original GWAS (715 PSC cases and 2962 controls) were combined in a meta-analysis comprising 1936 PSC cases and 6470 controls. We performed an analysis of bile microbial community composition in 39 PSC patients by 16S rRNA sequencing.
Seventeen SNPs representing 12 distinct genetic loci achieved nominal significance (P_replication<0.05) in the replication. The most robust novel association was detected at chromosome 1p36 (rs3748816; P_combined=2.1×10^−8) where the MMEL1 and TNFRSF14 genes represent potential disease genes. Eight additional novel loci showed suggestive evidence of association (P_replication<0.05). FUT2 at chromosome 19q13 (rs602662; P_combined=1.9×10^−6, rs281377; P_combined=2.1×10^−6 and rs601338; P_combined=2.7×10^−6) is notable due to its implication in altered susceptibility to infectious agents. We found that FUT2 secretor status and genotype defined by rs601338 significantly influence biliary microbial community composition in PSC patients.
We identify multiple new PSC risk loci by extended analysis of a PSC GWAS. FUT2 genotype needs to be taken into account when assessing the influence from microbiota on biliary pathology in PSC.
primary sclerosing cholangitis; genome-wide association study; single nucleotide polymorphism; immunogenetics
Serum concentrations of low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), triglycerides (TGs) and total cholesterol (TC) are important heritable risk factors for cardiovascular disease. Although genome-wide association studies (GWASs) of circulating lipid levels have identified numerous loci, a substantial portion of the heritability of these traits remains unexplained. Evidence of unexplained genetic variance can be detected by combining multiple independent markers into additive genetic risk scores. Such polygenic scores, constructed using results from the ENGAGE Consortium GWAS on serum lipids, were applied to predict lipid levels in an independent population-based study, the Rotterdam Study-II (RS-II). We additionally tested for evidence of a shared genetic basis for different lipid phenotypes. Finally, the polygenic score approach was used to identify an alternative genome-wide significance threshold before pathway analysis and those results were compared with those based on the classical genome-wide significance threshold. Our study provides evidence suggesting that many loci influencing circulating lipid levels remain undiscovered. Cross-prediction models suggested a small overlap between the polygenic backgrounds involved in determining LDL-C, HDL-C and TG levels. Pathway analysis utilizing the best polygenic score for TC uncovered extra information compared with using only genome-wide significant loci. These results suggest that the genetic architecture of circulating lipids involves a number of undiscovered variants with very small effects, and that increasing GWAS sample sizes will enable the identification of novel variants that regulate lipid levels.
serum lipids; polygenic; genome-wide association; polygenic score; pathway analysis
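The polygenic score approach described in the abstract above can be illustrated with a minimal sketch. It assumes an additive score (effect-size-weighted sum of effect-allele dosages) and a p-value threshold for choosing SNPs from the discovery GWAS. The SNP names, effect sizes, p-values and thresholds below are hypothetical and are not taken from the ENGAGE or RS-II data.

```python
from typing import Dict, Tuple


def select_snps(gwas: Dict[str, Tuple[float, float]], p_threshold: float) -> Dict[str, float]:
    """Keep the effect size of every SNP whose discovery p-value is below the threshold."""
    return {snp: beta for snp, (beta, p) in gwas.items() if p < p_threshold}


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Additive score: sum of effect size times effect-allele dosage (0 to 2)."""
    return sum(beta * dosages[snp] for snp, beta in weights.items() if snp in dosages)


# Hypothetical discovery results: SNP -> (effect size, p-value).
discovery = {
    "snp1": (0.10, 1e-9),
    "snp2": (-0.05, 1e-3),
    "snp3": (0.02, 0.2),
}

# Hypothetical effect-allele dosages for one person in the target study.
person = {"snp1": 2.0, "snp2": 1.0, "snp3": 0.0}

# Score from genome-wide significant SNPs only, and from a more liberal threshold.
strict_score = polygenic_score(person, select_snps(discovery, 5e-8))
liberal_score = polygenic_score(person, select_snps(discovery, 0.01))
print(strict_score, liberal_score)
```

Comparing how well scores built at different thresholds predict the trait in an independent sample is one way to judge whether SNPs below genome-wide significance still carry signal.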
To characterise the influence of fat-free mass on the metabolite profile in serum samples from participants of the population-based KORA (Cooperative Health Research in the Region of Augsburg) S4 study.
Subjects and Methods
Analyses were based on metabolite profiles from 965 participants of the S4 study and 890 weight-stable subjects of its seven-year follow-up study (KORA F4). 190 different serum metabolites were quantified in a targeted approach including amino acids, acylcarnitines, phosphatidylcholines (PCs), sphingomyelins and hexose. Associations between metabolite concentrations and the fat-free mass index (FFMI) were analysed using adjusted linear regression models. To draw conclusions on enzymatic reactions, intra-metabolite class ratios were explored. Pairwise relationships among metabolites were investigated and illustrated by means of Gaussian graphical models (GGMs).
We found 339 significant associations between FFMI and various metabolites in KORA S4. Among the most prominent associations (p-values 4.75×10−16–8.95×10−06) with higher FFMI were increasing concentrations of the branched-chain amino acids (BCAAs), ratios of BCAAs to glucogenic amino acids, and carnitine concentrations. For various PCs, a decrease in chain length or in saturation of the fatty acid moieties could be observed with increasing FFMI, as well as an overall shift from acyl-alkyl PCs to diacyl PCs. These findings were reproduced in KORA F4. The established GGMs supported the regression results and provided a comprehensive picture of the relationships between metabolites. In a sub-analysis, most of the discovered associations were absent in obese subjects, in contrast to non-obese subjects, possibly indicating derangements in skeletal muscle metabolism.
A set of serum metabolites strongly associated with FFMI was identified and a network explaining the relationships among metabolites was established. These results offer a novel and more complete picture of the FFMI effects on serum metabolites in a data-driven network.
Genome-wide association studies (GWAS) with metabolic traits and metabolome-wide association studies (MWAS) with traits of biomedical relevance are powerful tools to identify the contribution of genetic, environmental and lifestyle factors to the etiology of complex diseases. Hypothesis-free testing of ratios between all possible metabolite pairs in GWAS and MWAS has proven to be an innovative approach in the discovery of new biologically meaningful associations. The p-gain statistic was introduced as an ad-hoc measure to determine whether a ratio between two metabolite concentrations carries more information than the two corresponding metabolite concentrations alone. So far, only a rule of thumb was applied to determine the significance of the p-gain.
Here we explore the statistical properties of the p-gain through simulation of its density and by sampling of experimental data. We derive critical values of the p-gain for different levels of correlation between metabolite pairs and show that B/(2*α) is a conservative critical value for the p-gain, where α is the level of significance and B the number of tested metabolite pairs.
We show that the p-gain is a well-defined measure that can be used to identify statistically significant metabolite ratios in association studies, and we provide a conservative significance cut-off for the p-gain for use in future association studies with metabolic traits.
p-gain; Metabolomics; MWAS; GWAS; Genome-wide association studies; Metabolome-wide association studies
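The conservative cut-off B/(2*α) from the abstract above can be applied as in the following minimal sketch. The abstract does not spell out the p-gain formula, so the sketch assumes the commonly used definition: the smaller of the two single-metabolite p-values divided by the p-value of their ratio. The function names and all numbers are hypothetical.

```python
def p_gain(p_a: float, p_b: float, p_ratio: float) -> float:
    """Assumed definition: smaller single-metabolite p-value divided by the ratio's p-value."""
    return min(p_a, p_b) / p_ratio


def critical_p_gain(n_pairs: int, alpha: float = 0.05) -> float:
    """Conservative critical value B / (2 * alpha), with B the number of tested pairs."""
    return n_pairs / (2 * alpha)


def ratio_is_significant(p_a: float, p_b: float, p_ratio: float,
                         n_pairs: int, alpha: float = 0.05) -> bool:
    """A ratio is flagged when its p-gain exceeds the conservative critical value."""
    return p_gain(p_a, p_b, p_ratio) > critical_p_gain(n_pairs, alpha)


# Hypothetical example: 10 tested metabolite pairs at alpha = 0.05.
threshold = critical_p_gain(10)
strong = ratio_is_significant(1e-6, 1e-4, 1e-12, n_pairs=10)
weak = ratio_is_significant(1e-6, 1e-4, 1e-7, n_pairs=10)
print(threshold, strong, weak)
```

Because the critical value grows linearly with B, testing all pairs among many metabolites raises the bar considerably compared with a fixed rule of thumb.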
Common diseases such as type 2 diabetes are phenotypically heterogeneous. Obesity is a major risk factor for type 2 diabetes, but patients vary appreciably in body mass index. We hypothesized that the genetic predisposition to the disease may be different in lean (BMI<25 kg/m2) compared to obese cases (BMI≥30 kg/m2). We performed two case-control genome-wide studies using two accepted cut-offs for defining individuals as overweight or obese. We used 2,112 lean type 2 diabetes cases (BMI<25 kg/m2) or 4,123 obese cases (BMI≥30 kg/m2), and 54,412 un-stratified controls. Replication was performed in 2,881 lean cases or 8,702 obese cases, and 18,957 un-stratified controls. To assess the effects of known signals, we tested the individual and combined effects of SNPs representing 36 type 2 diabetes loci. After combining data from discovery and replication datasets, we identified two signals not previously reported in Europeans. A variant (rs8090011) in the LAMA1 gene was associated with type 2 diabetes in lean cases (P = 8.4×10−9, OR = 1.13 [95% CI 1.09–1.18]), and this association was stronger than that in obese cases (P = 0.04, OR = 1.03 [95% CI 1.00–1.06]). A variant in HMG20A (previously identified in South Asians but not Europeans) was associated with type 2 diabetes in obese cases (P = 1.3×10−8, OR = 1.11 [95% CI 1.07–1.15]), although this association was not significantly stronger than that in lean cases (P = 0.02, OR = 1.09 [95% CI 1.02–1.17]). For 36 known type 2 diabetes loci, 29 had a larger odds ratio in the lean compared to the obese cases (binomial P = 0.0002). In the lean analysis, we observed a weighted per-risk allele OR = 1.13 [95% CI 1.10–1.17], P = 3.2×10−14. This was larger than the same model fitted in the obese analysis where the OR = 1.06 [95% CI 1.05–1.08], P = 2.2×10−16.
This study provides evidence that stratification of type 2 diabetes cases by BMI may help identify additional risk variants and that lean cases may have a stronger genetic predisposition to type 2 diabetes.
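The binomial comparison reported above (29 of 36 known loci with a larger odds ratio in lean cases) can be checked with a short sketch. It assumes a one-sided sign test against a null probability of 0.5; the abstract does not state which form of the test was used, so this is an illustration rather than the authors' exact procedure.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Probability of observing k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 29 of 36 loci had a larger odds ratio in lean than in obese cases.
p_one_sided = binomial_upper_tail(29, 36)
print(f"one-sided binomial P = {p_one_sided:.2e}")
```

Under these assumptions the tail probability is roughly 0.00016, in line with the reported binomial P = 0.0002 after rounding.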
Individuals with type 2 diabetes (T2D) can present with variable clinical characteristics. It is well known that obesity is a major risk factor for type 2 diabetes, yet patients can vary considerably: there are many lean diabetes patients and many overweight people without diabetes. We hypothesized that the genetic predisposition to the disease may be different in lean (BMI<25 kg/m2) compared to obese cases (BMI≥30 kg/m2). Specifically, because lean T2D patients developed the disease despite a lower obesity-related risk than obese patients, we reasoned that they may be more genetically susceptible. Using genetic data from multiple genome-wide association studies, we tested genetic markers across the genome in 2,112 lean type 2 diabetes cases (BMI<25 kg/m2), 4,123 obese cases (BMI≥30 kg/m2), and 54,412 healthy controls. We confirmed our results in an additional 2,881 lean cases, 8,702 obese cases, and 18,957 healthy controls. Using these data we found differences in genetic enrichment between lean and obese cases, supporting our original hypothesis. We also searched for genetic variants that may be risk factors only in lean or obese patients and found two novel gene regions not previously reported in European individuals. These findings may influence future study design for type 2 diabetes and provide further insight into the biology of the disease.
Pulmonary function measures reflect respiratory health and predict mortality, and are used in the diagnosis of chronic obstructive pulmonary disease (COPD). We tested genome-wide association with the forced expiratory volume in 1 second (FEV1) and the ratio of FEV1 to forced vital capacity (FVC) in 48,201 individuals of European ancestry, with follow-up of top associations in up to an additional 46,411 individuals. We identified new regions showing association (combined P<5×10−8) with pulmonary function, in or near MFAP2, TGFB2, HDAC4, RARB, MECOM (EVI1), SPATA9, ARMC2, NCR3, ZKSCAN3, CDC123, C10orf11, LRP1, CCDC38, MMP15, CFDP1, and KCNE2. Identification of these 16 new loci may provide insight into the molecular mechanisms regulating pulmonary function and into molecular targets for future therapy to alleviate reduced lung function.
Platelets are the second most abundant cell type in blood and are essential for maintaining haemostasis. Their count and volume are tightly controlled within narrow physiological ranges, but there is only limited understanding of the molecular processes controlling both traits. Here we carried out a high-powered meta-analysis of genome-wide association studies (GWAS) in up to 66,867 individuals of European ancestry, followed by extensive biological and functional assessment. We identified 68 genomic loci reliably associated with platelet count and volume mapping to established and putative novel regulators of megakaryopoiesis and platelet formation. These genes show megakaryocyte-specific gene expression patterns and extensive network connectivity. Using gene silencing in Danio rerio and Drosophila melanogaster, we identified 11 of the genes as novel regulators of blood cell formation. Taken together, our findings advance understanding of novel gene functions controlling fate-determining events during megakaryopoiesis and platelet formation, providing a new example of successful translation of GWAS to function.