Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many eQTL studies typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis-effect on expression cannot be accounted for by common cis-variants, a finding which exposes the contribution of low frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene and identify several replicating trans-variants which act predominantly in a tissue-restricted manner and may regulate the transcription of many genes.
Serum metabolite concentrations provide a direct readout of biological processes in the human body, and are associated with disorders such as cardiovascular and metabolic diseases. Here we present a genome-wide association study with 163 metabolic traits using 1809 participants from the KORA population, followed up in the TwinsUK cohort with 422 participants. In eight out of nine replicated loci (FADS1, ELOVL2, ACADS, ACADM, ACADL, SPTLC3, ETFDH, SLC16A9) the genetic variant is located in or near enzyme or solute carrier coding genes, where the associating metabolic traits match the proteins’ function. Many of these loci are located in rate limiting steps of important enzymatic reactions. Use of metabolite concentration ratios as proxies for enzymatic reaction rates reduces the variance and yields robust statistical associations with p-values between 3×10−24 and 6.5×10−179. These loci explained 5.6% to 36.3% of the observed variance. For several loci, associations with clinically relevant parameters have previously been reported.
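One plausible reason a ratio can be less variable than a single concentration is that variation shared by substrate and product cancels in the quotient. The toy sketch below illustrates only that idea. The sample factors, concentrations and function name are hypothetical and are not taken from the KORA or TwinsUK data.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

# Hypothetical per-sample factor (e.g. dilution) affecting both metabolites alike.
sample_factor = [0.8, 1.0, 1.3, 0.9, 1.1]

# Hypothetical substrate and product concentrations sharing that factor.
substrate = [10.0 * f for f in sample_factor]
product = [5.0 * f for f in sample_factor]

# The shared factor cancels in the product/substrate ratio.
ratio = [p / s for p, s in zip(product, substrate)]

cv_product = coefficient_of_variation(product)
cv_ratio = coefficient_of_variation(ratio)
print(f"CV of product alone: {cv_product:.3f}")
print(f"CV of product/substrate ratio: {cv_ratio:.3f}")
```

In real data the cancellation would be only partial, so this shows the direction of the effect rather than its size.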
We conducted genome-wide association analyses of mean leukocyte telomere length in 2,917 subjects and follow-up replication analyses in 9,492 and identified a locus on 3q26 encompassing the telomerase RNA component TERC, with compelling evidence for association (rs12696304, combined P value 3.72×10−14). Each copy of the minor allele of rs12696304 was associated with ≈75 base pairs shorter mean telomere length equivalent to ≈3.6 years of age-related attrition of mean telomere length.
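As a back-of-the-envelope check on the figures above, dividing the per-allele effect by the stated age equivalent gives the implied yearly attrition rate. The sketch below is only arithmetic on the two rounded numbers quoted in the abstract; the function name is illustrative and not from the study.

```python
def implied_attrition_rate(bp_per_allele: float, years_equivalent: float) -> float:
    """Return the yearly telomere attrition (bp/year) implied by the two figures."""
    return bp_per_allele / years_equivalent

# Rounded values quoted in the abstract: ~75 bp per minor allele, ~3.6 years.
rate_bp_per_year = implied_attrition_rate(75.0, 3.6)
print(f"Implied attrition: about {rate_bp_per_year:.1f} bp per year")
```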
Red blood cell related traits are highly heritable but their genetics are poorly defined. Only 5–10% of the total observed variance is explained by the genetic loci found to date, suggesting that additional loci should be sought using approaches other than large meta-analyses. A genome-wide association study (GWAS) for red blood cell traits in a founder population cohort from Northern Italy identified a new locus for mean corpuscular hemoglobin concentration (MCHC) in the TAF3 gene. The association was replicated in two cohorts (rs1887582, P = 4.25E–09). TAF3 encodes a transcription cofactor that participates in the core promoter recognition complex, and is involved in zebrafish and mouse erythropoiesis. We show here that TAF3 is required for transcription of the SPTA1 gene, encoding alpha spectrin, one of the proteins that link the plasma membrane to the actin cytoskeleton. Mutations in SPTA1 are responsible for hereditary spherocytosis, a monogenic disorder of MCHC, as well as for the normal MCHC level. Based on our results, we propose that TAF3 is required for normal erythropoiesis in humans and that it might have a role in controlling the ratio between hemoglobin (Hb) and cell volume and in the dynamics of RBC maturation in healthy individuals. Finally, TAF3 represents a potential candidate or a modifier gene for disorders of the red cell membrane.
Genetic variants that associate with DNA methylation at CpG sites (methylation quantitative trait loci, meQTLs) offer a potential biological mechanism of action for disease associated SNPs. We investigated whether meQTLs exist in abdominal subcutaneous adipose tissue (SAT) and if CpG methylation associates with metabolic syndrome (MetSyn) phenotypes. We profiled 27,718 genomic regions in abdominal SAT samples of 38 unrelated individuals using differential methylation hybridization (DMH) together with genotypes at 5,227,243 SNPs and expression of 17,209 mRNA transcripts. Validation and replication of significant meQTLs was pursued in an independent cohort of 181 female twins. We find that, at 5% false discovery rate, methylation levels of 149 DMH regions associate with at least one SNP in a ±500 kilobase cis-region in our primary study. We sought to validate 19 of these in the replication study and find that five of these significantly associate with the corresponding meQTL SNPs from the primary study. We find that none of the 149 meQTL top SNPs is a significant expression quantitative trait locus in our expression data, but we observed association between expression levels of two mRNA transcripts and cis-methylation status. Our results indicate that DNA CpG methylation in abdominal SAT is partly under genetic control. This study provides a starting point for future investigations of DNA methylation in adipose tissue.
Sensitivity to pain varies considerably between individuals and is known to be heritable. Increased sensitivity to experimental pain is a risk factor for developing chronic pain, a common and debilitating but poorly understood symptom. To understand mechanisms underlying pain sensitivity and to search for rare gene variants (MAF<5%) influencing pain sensitivity, we explored the genetic variation in individuals' responses to experimental pain. Quantitative sensory testing to heat pain was performed in 2,500 volunteers from TwinsUK (TUK); exome sequencing to a depth of 70× was carried out on DNA from singletons at the high and low ends of the heat pain sensitivity distribution in two separate subsamples. Thus in TUK1, 101 pain-sensitive and 102 pain-insensitive individuals were examined, while in TUK2 there were 114 and 96 individuals respectively. A combination of methods was used to test the association between rare variants and pain sensitivity, and the function of the genes identified was explored using network analysis. Using causal reasoning analysis on the genes with different patterns of SNVs by pain sensitivity status, we observed a significant enrichment of variants in genes of the angiotensin pathway (Bonferroni corrected p = 3.8×10−4). This pathway is already implicated in animal models and human studies of pain, supporting the notion that it may provide fruitful new targets in pain management. The approach of sequencing extreme exome variation in normal individuals has provided important insights into gene networks mediating pain sensitivity in humans and will be applicable to other common complex traits.
Chronic widespread pain is a complex clinical problem. Identification of underlying genetic factors would shed light on the biology of pain and offer targets for novel therapies. We aimed to identify rare genetic variants in the normal population associated with pain sensation by performing exome sequencing on individuals who were more or less sensitive to heat pain. While we did not identify any single variants having large effect, we did observe major group differences between the sensitive and insensitive individuals. Network analysis suggested a role for the angiotensin pathway, which previous work in animal models has suggested is important in pain mediation. Our results cast light on the genetic factors underlying normal pain sensation in humans and the utility of exome analyses. They suggest that further exploration of the angiotensin pathway may reveal novel targets for the treatment of pain.
To identify previously unknown genetic loci associated with fasting glucose concentrations, we examined the leading association signals in ten genome-wide association scans involving a total of 36,610 individuals of European descent. Variants in the gene encoding melatonin receptor 1B (MTNR1B) were consistently associated with fasting glucose across all ten studies. The strongest signal was observed at rs10830963, where each G allele (frequency 0.30 in HapMap CEU) was associated with an increase of 0.07 (95% CI = 0.06-0.08) mmol/l in fasting glucose levels (P = 3.2 × 10−50) and reduced beta-cell function as measured by homeostasis model assessment (HOMA-B, P = 1.1 × 10−15). The same allele was associated with an increased risk of type 2 diabetes (odds ratio = 1.09 (1.05-1.12), per G allele P = 3.3 × 10−7) in a meta-analysis of 13 case-control studies totaling 18,236 cases and 64,453 controls. Our analyses also confirm previous associations of fasting glucose with variants at the G6PC2 (rs560887, P = 1.1 × 10−57) and GCK (rs4607517, P = 1.0 × 10−25) loci.
Genome-wide association studies have identified hundreds of loci for type 2 diabetes, coronary artery disease and myocardial infarction, as well as for related traits such as body mass index, glucose and insulin levels, lipid levels, and blood pressure. These studies also have pointed to thousands of loci with promising but not yet compelling association evidence. To establish association at additional loci and to characterize the genome-wide significant loci by fine-mapping, we designed the “Metabochip,” a custom genotyping array that assays nearly 200,000 SNP markers. Here, we describe the Metabochip and its component SNP sets, evaluate its performance in capturing variation across the allele-frequency spectrum, describe solutions to methodological challenges commonly encountered in its analysis, and evaluate its performance as a platform for genotype imputation. The Metabochip achieves dramatic cost efficiencies compared to designing single-trait follow-up reagents, and provides the opportunity to compare results across a range of related traits. The Metabochip and similar custom genotyping arrays offer a powerful and cost-effective approach to follow up large-scale genotyping and sequencing studies and advance our understanding of the genetic basis of complex human diseases and traits.
Recent genetic studies have identified hundreds of regions of the human genome that contribute to risk for type 2 diabetes, coronary artery disease and myocardial infarction, and to related quantitative traits such as body mass index, glucose and insulin levels, blood lipid levels, and blood pressure. These results motivate two central questions: (1) can further genetic investigation identify additional associated regions; and (2) can more detailed genetic investigation help us identify the causal variants (or variants more strongly correlated with the causal variants) in the regions identified so far? Addressing these questions requires assaying many genetic variants in DNA samples from thousands of individuals, which is expensive and time-consuming when done a few SNPs at a time. To facilitate these investigations, we designed the “Metabochip,” a custom genotyping array that assays variation at nearly 200,000 sites in the human genome. Here we describe the Metabochip, evaluate its performance in assaying human genetic variation, and describe solutions to methodological challenges commonly encountered in its analysis.
Glycated hemoglobin (HbA1c), used to monitor and diagnose diabetes, is influenced by average glycemia over a 2- to 3-month period. Genetic factors affecting expression, turnover, and abnormal glycation of hemoglobin could also be associated with increased levels of HbA1c. We aimed to identify such genetic factors and investigate the extent to which they influence diabetes classification based on HbA1c levels.
RESEARCH DESIGN AND METHODS
We studied associations with HbA1c in up to 46,368 nondiabetic adults of European descent from 23 genome-wide association studies (GWAS) and 8 cohorts with de novo genotyped single nucleotide polymorphisms (SNPs). We combined studies using inverse-variance meta-analysis and tested mediation by glycemia using conditional analyses. We estimated the global effect of HbA1c loci using a multilocus risk score, and used net reclassification to estimate genetic effects on diabetes screening.
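The inverse-variance combination mentioned above can be sketched as follows. This is a minimal fixed-effect version, which is an assumption, since the abstract does not say whether a fixed- or random-effects model was used. The per-study effect sizes and standard errors are hypothetical placeholders, not values from the study.

```python
import math

def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance meta-analysis.

    Each study is weighted by 1 / SE^2; returns the pooled effect,
    its standard error, and the corresponding z-score.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se

# Hypothetical per-study estimates for one SNP (illustration only).
study_betas = [0.10, 0.20]
study_ses = [0.10, 0.10]

pooled_beta, pooled_se, z_score = inverse_variance_meta(study_betas, study_ses)
print(f"pooled beta = {pooled_beta:.3f}, SE = {pooled_se:.4f}, z = {z_score:.2f}")
```

Because the weights are the reciprocal variances, more precise studies dominate the pooled estimate, and the pooled standard error is never larger than that of the most precise single study.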
Ten loci reached genome-wide significant association with HbA1c, including six new loci near FN3K (lead SNP/P value, rs1046896/P = 1.6 × 10−26), HFE (rs1800562/P = 2.6 × 10−20), TMPRSS6 (rs855791/P = 2.7 × 10−14), ANK1 (rs4737009/P = 6.1 × 10−12), SPTA1 (rs2779116/P = 2.8 × 10−9) and ATP11A/TUBGCP3 (rs7998202/P = 5.2 × 10−9), and four known HbA1c loci: HK1 (rs16926246/P = 3.1 × 10−54), MTNR1B (rs1387153/P = 4.0 × 10−11), GCK (rs1799884/P = 1.5 × 10−20) and G6PC2/ABCB11 (rs552976/P = 8.2 × 10−18). We show that associations with HbA1c are partly a function of hyperglycemia associated with 3 of the 10 loci (GCK, G6PC2 and MTNR1B). The seven nonglycemic loci accounted for a 0.19 (% HbA1c) difference between the extreme 10% tails of the risk score, and would reclassify ∼2% of a general white population screened for diabetes with HbA1c.
GWAS identified 10 genetic loci reproducibly associated with HbA1c. Six are novel and seven map to loci where rarer variants cause hereditary anemias and iron storage disorders. Common variants at these loci likely influence HbA1c levels via erythrocyte biology, and confer a small but detectable reclassification of diabetes diagnosis by HbA1c.
Genome-wide association studies have identified many genetic variants associated with complex traits. However, at only a minority of loci have the molecular mechanisms mediating these associations been characterized. In parallel, whilst cis-regulatory patterns of gene expression have been extensively explored, the identification of trans-regulatory effects in humans has attracted less attention. We demonstrate that the Type 2 diabetes and HDL-cholesterol associated cis-acting eQTL of the maternally-expressed transcription factor KLF14 acts as a master trans-regulator of adipose gene expression. Expression levels of genes regulated by this trans-eQTL are highly-correlated with concurrently-measured metabolic traits, and a subset of the trans-genes harbor variants directly-associated with metabolic phenotypes. This trans-eQTL network provides a mechanistic understanding of the effect of the KLF14 locus on metabolic disease risk, providing a potential model for other complex traits.
Recent genome-wide association (GWA) studies described 95 loci controlling serum lipid levels. These common variants explain ∼25% of the heritability of the phenotypes. To date, no unbiased screen for gene–environment interactions for circulating lipids has been reported. We screened for variants that modify the relationship between known epidemiological risk factors and circulating lipid levels in a meta-analysis of GWA data from 18 population-based cohorts with European ancestry (maximum N = 32,225). We collected 8 further cohorts (N = 17,102) for replication, and rs6448771 on 4p15 demonstrated genome-wide significant interaction with waist-to-hip ratio (WHR) on total cholesterol (TC) with a combined P-value of 4.79×10−9. There were two potential candidate genes in the region, PCDH7 and CCKAR, with differential expression levels for rs6448771 genotypes in adipose tissue. The effect of WHR on TC was strongest for individuals carrying two copies of the G allele, for whom a one standard deviation (sd) difference in WHR corresponds to a 0.19 sd difference in TC concentration, while for A-allele homozygotes the difference was 0.12 sd. Our findings may open up possibilities for targeted intervention strategies for people characterized by specific genomic profiles. However, more refined measures of both body-fat distribution and metabolic measures are needed to understand how their joint dynamics are modified by the newly found locus.
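The genotype-specific effects quoted above can be read as slopes in an interaction model of the form TC ~ WHR + G + WHR × G, where G counts copies of the G allele. The sketch below assumes the interaction is additive per allele, which the abstract does not state, so the heterozygote slope it produces is an interpolation under that assumption and not a reported result.

```python
def whr_slope_by_genotype(slope_aa: float, slope_gg: float) -> dict:
    """Interpolate the WHR-on-TC slope (sd per sd) for 0, 1 and 2 G alleles,
    assuming an additive per-allele interaction effect."""
    per_allele_interaction = (slope_gg - slope_aa) / 2.0
    return {g_copies: slope_aa + g_copies * per_allele_interaction
            for g_copies in (0, 1, 2)}

# Slopes quoted in the abstract: 0.12 sd (A/A) and 0.19 sd (G/G).
slopes = whr_slope_by_genotype(slope_aa=0.12, slope_gg=0.19)
for g_copies, slope in slopes.items():
    print(f"{g_copies} G allele(s): {slope:.3f} sd TC per sd WHR")
```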
Circulating serum lipids contribute greatly to global health by affecting the risk for cardiovascular diseases. Serum lipid levels are partly inherited, and already 95 loci affecting high- and low-density lipoprotein cholesterol, total cholesterol, and triglycerides have been found. Serum lipids are also known to be affected by multiple epidemiological risk factors like body composition, lifestyle, and sex. It has been hypothesized that there are loci modifying the effects between risk factors and serum lipids, but to date only candidate gene studies for interactions have been reported. We conducted a genome-wide screen with a meta-analysis approach to identify loci having interactions with epidemiological risk factors on serum lipids with over 30,000 population-based samples. When combining results from our initial datasets and 8 additional replication cohorts (maximum N = 17,102), we found a genome-wide significant locus in chromosome 4p15 with a joint P-value of 4.79×10−9 modifying the effect of waist-to-hip ratio on total cholesterol. In the area surrounding this genetic variant, there were two genes having association between the genotypes and the gene expression in adipose tissue, and we also found enrichment of association in genes belonging to lipid metabolism related functions.
Glycated hemoglobin A1c (HbA1c) indicates the percentage of total hemoglobin that is bound by glucose, produced from the nonenzymatic chemical modification by glucose of hemoglobin molecules carried in erythrocytes. HbA1c represents a surrogate marker of average blood glucose concentration over the previous 8 to 12 weeks, or the average lifespan of the erythrocyte, and thus represents a more stable indicator of glycemic status compared with fasting glucose. HbA1c levels are genetically determined, with heritability of 47% to 59%. Over the past few years, inroads into understanding genetic predisposition by glycemic and nonglycemic factors have been achieved through genome-wide analyses. Here I review current research aimed at discovering genetic determinants of HbA1c levels, discussing insights into biologic factors influencing variability in the general and diabetic population, and across different ethnicities. Furthermore, I discuss briefly the relevance of findings for diabetes monitoring and diagnosis.
Glycated hemoglobin; HbA1c; Genome-wide association study; Single nucleotide polymorphism; Genetic; Diabetes
The integrated analysis of genotypic and expression data for association with complex traits could identify novel genetic pathways involved in complex traits. We profiled 19,573 expression probes in Epstein-Barr virus-transformed lymphoblastoid cell lines (LCLs) from 299 twins and correlated these with 44 quantitative traits (QTs). For 939 expressed probes correlating with more than one QT, we investigated the presence of eQTL associations in three datasets of 57 CEU HapMap founders and 86 unrelated twins. Genome-wide association analysis of these probes with 2.2 million SNPs revealed 131 potential eQTLs (1,989 eQTL SNPs) overlapping between the HapMap datasets, five of which were in cis (58 eQTL SNPs). We then tested 535 SNPs tagging the eQTL SNPs for association with the relevant QT in 2,905 twins. We identified nine potential SNP-QT associations (P<0.01) but none significantly replicated in five large consortia of 1,097–16,129 subjects. We also failed to replicate previously reported eQTL associations with body mass index, plasma low-density lipoprotein cholesterol, high-density lipoprotein cholesterol and triglyceride levels derived from lymphocytes, adipose and liver tissue. Our results and additional power calculations suggest that proponents may have been overoptimistic about the power of LCLs in eQTL approaches to elucidate regulatory genetic effects on complex traits using the small datasets generated to date. Nevertheless, larger tissue-specific expression data sets relevant to specific traits are becoming available, and should enable the adoption of similar integrated analyses in the near future.
Turning genetic discoveries identified in genome-wide association (GWA) studies into biological mechanisms is an important challenge in human genetics. Many GWA signals map outside exons, suggesting that the associated variants may lie within regulatory regions. We applied the formaldehyde-assisted isolation of regulatory elements (FAIRE) method in a megakaryocytic and an erythroblastoid cell line to map active regulatory elements at known loci associated with hematological quantitative traits, coronary artery disease, and myocardial infarction. We showed that the two cell types exhibit distinct patterns of open chromatin and that cell-specific open chromatin can guide the finding of functional variants. We identified an open chromatin region at chromosome 7q22.3 in megakaryocytes but not erythroblasts, which harbors the common non-coding sequence variant rs342293 known to be associated with platelet volume and function. Resequencing of this open chromatin region in 643 individuals provided strong evidence that rs342293 is the only putative causative variant in this region. We demonstrated that the C- and G-alleles differentially bind the transcription factor EVI1 affecting PIK3CG gene expression in platelets and macrophages. A protein–protein interaction network including up- and down-regulated genes in Pik3cg knockout mice indicated that PIK3CG is associated with gene pathways with an established role in platelet membrane biogenesis and thrombus formation. Thus, rs342293 is the functional common variant at this locus; to the best of our knowledge this is the first such variant to be elucidated among the known platelet quantitative trait loci (QTLs). Our data suggested a molecular mechanism by which a non-coding GWA index SNP modulates platelet phenotype.
Genome-wide scans have revealed multiple genetic regions underlying complex traits. However, the transition from an initial association signal to identifying the functional DNA change(s) has proved challenging. Many of the DNA changes discovered are located outside protein-coding regions and may exert their effects through gene regulation. We screened genetic regions associated with hematological traits in erythroblasts (red blood cells) and megakaryocytes (platelet-producing cells) and mapped sites of open chromatin, which harbor active gene regulatory elements. We investigated a DNA sequence change located within a site of open chromatin at chromosome 7 in megakaryocytes, but not erythroblasts, known to be associated with platelet volume. We showed that this DNA change is functional due to alteration of the binding site of a transcription factor, which regulates the expression of a gene that affects platelet characteristics. Mice lacking this gene revealed significant differences in expression of several important platelet genes compared to wild-type mice. The approach described here can be applied in different cell types to functionally follow-up association signals with many other biological traits by identification of the causative base change and how it affects gene function, thus paving the way to clinical benefit.
The number and volume of cells in the blood affect a wide range of disorders including cancer and cardiovascular, metabolic, infectious and immune conditions. We consider here the genetic variation in eight clinically relevant hematological parameters, including hemoglobin levels, red and white blood cell counts and platelet counts and volume. We describe common variants within 22 genetic loci reproducibly associated with these hematological parameters in 13,943 samples from six European population-based studies, including 6 associated with red blood cell parameters, 15 associated with platelet parameters and 1 associated with total white blood cell count. We further identified a long-range haplotype at 12q24 associated with coronary artery disease in 9,479 cases and 10,527 controls. We show that this haplotype demonstrates extensive disease pleiotropy, as it contains known risk loci for type 1 diabetes, hypertension and celiac disease and has been spread by a selective sweep specific to European and geographically nearby populations.
Lung function measures are heritable traits that predict population morbidity and mortality and are essential for the diagnosis of chronic obstructive pulmonary disease (COPD). Variations in many genes have been reported to affect these traits, but attempts at replication have provided conflicting results. Recently, we undertook a meta-analysis of Genome Wide Association Study (GWAS) results for lung function measures in 20,288 individuals from the general population (the SpiroMeta consortium).
To comprehensively analyse previously reported genetic associations with lung function measures, and to investigate whether single nucleotide polymorphisms (SNPs) in these genomic regions are associated with lung function in a large population sample.
We analysed association for SNPs tagging 130 genes and 48 intergenic regions (+/−10 kb), after conducting a systematic review of the literature in the PubMed database for genetic association studies reporting lung function associations.
The analysis included 16,936 genotyped and imputed SNPs. No loci showed overall significant association for FEV1 or FEV1/FVC traits using a carefully defined significance threshold of 1.3×10−5. The most significant loci associated with FEV1 include SNPs tagging MACROD2 (P = 6.81×10−5), CNTN5 (P = 4.37×10−4), and TRPV4 (P = 1.58×10−3). Among ever-smokers, SERPINA1 showed the most significant association with FEV1 (P = 8.41×10−5), followed by PDE4D (P = 1.22×10−4). The strongest association with FEV1/FVC ratio was observed with ABCC1 (P = 4.38×10−4), and ESR1 (P = 5.42×10−4) among ever-smokers.
Polymorphisms spanning previously associated lung function genes did not show strong evidence for association with lung function measures in the SpiroMeta consortium population. Common SERPINA1 polymorphisms may affect FEV1 among smokers in the general population.
Smoking is a risk factor for most of the diseases leading in mortality [1]. We conducted genome-wide association (GWA) meta-analyses of smoking data within the ENGAGE consortium to search for common alleles associating with the number of cigarettes smoked per day (CPD) in smokers (N = 31,266) and smoking initiation (N = 46,481). We tested selected SNPs in a second stage (N = 45,691 smokers), and assessed some in a third sample (N = 9,040). Variants in three genomic regions associated with CPD (P < 5·10−8), including previously identified SNPs at 15q25 represented by rs1051730-A (0.80 CPD, P = 2.4·10−69), and SNPs at 19q13 and 8p11, represented by rs4105144-C (0.39 CPD, P = 2.2·10−12) and rs6474412-T (0.29 CPD, P = 1.4·10−8), respectively. Among the genes at the two novel loci are genes encoding nicotine-metabolizing enzymes (CYP2A6 and CYP2B6), and nicotinic acetylcholine receptor subunits (CHRNB3 and CHRNA6) highlighted in previous studies of nicotine dependence [2-3]. Nominal associations with lung cancer were observed at both 8p11 (rs6474412-T, OR = 1.09, P = 0.04) and 19q13 (rs4105144-C, OR = 1.12, P = 0.0006).
Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence phenotype. Genome-wide association (GWA) studies have identified >600 variants associated with human traits [1], but these typically explain small fractions of phenotypic variation, raising questions about the utility of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait [2,3]. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016), and that underlie skeletal growth defects (P < 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants, and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented amongst variants that alter amino acid structure of proteins and expression levels of nearby genes. Our data explain ∼10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to ∼16% of phenotypic variation (∼20% of heritable variation).
Although additional approaches are needed to fully dissect the genetic architecture of polygenic human traits, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.
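The height figures above equate ∼16% of phenotypic variation with ∼20% of heritable variation, which pins down the heritability those two numbers imply. The sketch below is arithmetic on the rounded values from the abstract only; the function name is illustrative, and the derived 12.5% is a consequence of that rounding, not a figure reported by the study.

```python
def implied_heritability(frac_of_phenotypic: float, frac_of_heritable: float) -> float:
    """Heritability implied when the same variance is expressed both ways."""
    return frac_of_phenotypic / frac_of_heritable

# ~16% of phenotypic variation is stated to equal ~20% of heritable variation.
h2 = implied_heritability(0.16, 0.20)

# Share of heritable variation covered by the ~10% already explained.
explained_share_of_heritable = 0.10 / h2

print(f"Implied heritability of height: about {h2:.2f}")
print(f"Identified loci cover about {explained_share_of_heritable:.1%} of heritable variation")
```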
Dehydroepiandrosterone sulphate (DHEAS) is the most abundant circulating steroid secreted by adrenal glands—yet its function is unknown. Its serum concentration declines significantly with increasing age, which has led to speculation that a relative DHEAS deficiency may contribute to the development of common age-related diseases or diminished longevity. We conducted a meta-analysis of genome-wide association data with 14,846 individuals and identified eight independent common SNPs associated with serum DHEAS concentrations. Genes at or near the identified loci include ZKSCAN5 (rs11761528; p = 3.15×10−36), SULT2A1 (rs2637125; p = 2.61×10−19), ARPC1A (rs740160; p = 1.56×10−16), TRIM4 (rs17277546; p = 4.50×10−11), BMF (rs7181230; p = 5.44×10−11), HHEX (rs2497306; p = 4.64×10−9), BCL2L11 (rs6738028; p = 1.72×10−8), and CYP2C9 (rs2185570; p = 2.29×10−8). These genes are associated with type 2 diabetes, lymphoma, actin filament assembly, drug and xenobiotic metabolism, and zinc finger proteins. Several SNPs were associated with changes in gene expression levels, and the related genes are connected to biological pathways linking DHEAS with ageing. This study provides much needed insight into the function of DHEAS.
Dehydroepiandrosterone sulphate (DHEAS), mainly secreted by the adrenal gland, is the most abundant circulating steroid in humans. It shows a significant physiological decline after the age of 25 and diminishes by about 95% by the age of 85 years, which has led to speculation that a relative DHEAS deficiency may contribute to the development of common age-related diseases or diminished longevity. Twin- and family-based studies have shown that there is a substantial genetic effect with a heritability estimate of 60%, but no specific genes regulating serum DHEAS concentration have been identified to date. Here we take advantage of recent technical and methodological advances to examine the effects of common genetic variants on serum DHEAS concentrations. By examining 14,846 Caucasian individuals, we show that eight common genetic variants are associated with serum DHEAS concentrations. Genes at or near these genetic variants include BCL2L11, ARPC1A, ZKSCAN5, TRIM4, HHEX, CYP2C9, BMF, and SULT2A1. These genes have various associations with steroid hormone metabolism—co-morbidities of ageing including type 2 diabetes, lymphoma, actin filament assembly, drug and xenobiotic metabolism, and zinc finger proteins—suggesting a wider functional role for DHEAS than previously thought.
Plasma levels of coagulation factors VII (FVII), VIII (FVIII), and von Willebrand factor (vWF) influence risk of hemorrhage and thrombosis. We conducted genome-wide association studies to identify new loci associated with plasma levels.
Methods and Results
The discovery setting comprised 5 community-based studies with 23,608 European-ancestry participants: ARIC, CHS, B58C, FHS, and RS. All had genome-wide single nucleotide polymorphism (SNP) scans and at least 1 phenotype measured: FVII activity/antigen, FVIII activity, and vWF antigen. Each study used its genotype data to impute to HapMap SNPs and independently conducted association analyses of hemostasis measures using an additive genetic model. Study findings were combined by meta-analysis. Replication was conducted in 7,604 participants not in the discovery cohort. For FVII, 305 SNPs exceeded the genome-wide significance threshold of 5.0×10⁻⁸ and comprised 5 loci on 5 chromosomes: 2p23 (smallest p-value 6.2×10⁻²⁴), 4q25 (3.6×10⁻¹²), 11q12 (2.0×10⁻¹⁰), 13q34 (9.0×10⁻²⁵⁹), and 20q11.2 (5.7×10⁻³⁷). Loci were within or near genes, including 4 new candidate genes and F7 (13q34). For vWF, 400 SNPs exceeded the threshold and marked 8 loci on 6 chromosomes: 6q24 (1.2×10⁻²²), 8p21 (1.3×10⁻¹⁶), 9q34 (<5.0×10⁻³²⁴), 12p13 (1.7×10⁻³²), 12q23 (7.3×10⁻¹⁰), 12q24.3 (3.8×10⁻¹¹), 14q32 (2.3×10⁻¹⁰), and 19p13.2 (1.3×10⁻⁹). All loci were within genes, including 6 new candidate genes, as well as ABO (9q34) and VWF (12p13). For FVIII, 5 loci were identified and overlapped vWF findings. Nine of the 10 new findings replicated.
New genetic associations were discovered outside previously known biologic pathways and may point to novel prevention and treatment targets of hemostasis disorders.
genome-wide variation; factor VII; factor VIII; von Willebrand factor; epidemiology; meta-analysis; thrombosis; hemostasis
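The abstract above states that each study fitted an additive genetic model and that study findings were combined by meta-analysis, but it does not name the pooling method. As a minimal sketch, assuming ordinary least-squares regression of the phenotype on allele dosage and inverse-variance-weighted fixed-effects pooling (a common choice, not confirmed by the source), the two steps might look like this in Python. The function names are hypothetical, and the p-value uses a normal approximation.

```python
import math

# Genome-wide significance threshold quoted in the abstract.
GENOME_WIDE_ALPHA = 5.0e-8


def additive_fit(dosages, phenotypes):
    """Least-squares slope and standard error of phenotype on allele dosage (0, 1, 2).

    Hypothetical helper: one SNP, one study, no covariates.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    beta = sxy / sxx  # per-allele effect under the additive model
    intercept = mean_y - beta * mean_x
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(dosages, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se


def fixed_effects_meta(estimates):
    """Pool per-study (beta, se) pairs by inverse-variance weighting.

    Assumption: fixed-effects pooling; the abstract does not specify the method.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return pooled_beta, pooled_se, p_value


def is_genome_wide_significant(p_value):
    """True when the pooled p-value is below the 5.0e-8 threshold."""
    return p_value < GENOME_WIDE_ALPHA
```

In use, `additive_fit` would be called once per study for a given SNP, and the resulting `(beta, se)` pairs passed to `fixed_effects_meta`. For very strong signals the double-precision p-value underflows to zero, which is consistent with the abstract reporting the 9q34 signal only as an upper bound.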
While there have been studies exploring regulatory variation in one or more tissues, the complexity of tissue-specificity in multiple primary tissues is not yet well understood. We explore in depth the role of cis-regulatory variation in three human tissues: lymphoblastoid cell lines (LCL), skin, and fat. The samples (156 LCL, 160 skin, 166 fat) were derived simultaneously from a subset of well-phenotyped healthy female twins of the MuTHER resource. We discover an abundance of cis-eQTLs in each tissue similar to previous estimates (858 or 4.7% of genes). In addition, we apply factor analysis (FA) to remove effects of latent variables, thus more than doubling the number of our discoveries (1,822 eQTL genes). The unique study design (Matched Co-Twin Analysis—MCTA) permits immediate replication of eQTLs using co-twins (93%–98%) and validation of the considerable gain in eQTL discovery after FA correction. We highlight the challenges of comparing eQTLs between tissues. After verifying previous significance threshold-based estimates of tissue-specificity, we show their limitations given their dependency on statistical power. We propose that continuous estimates of the proportion of tissue-shared signals and direct comparison of the magnitude of effect on the fold change in expression are essential properties that jointly provide a biologically realistic view of tissue-specificity. Under this framework we demonstrate that 30% of eQTLs are shared among the three tissues studied, while another 29% appear exclusively tissue-specific. However, even among the shared eQTLs, a substantial proportion (10%–20%) have significant differences in the magnitude of fold change between genotypic classes across tissues. Our results underline the need to account for the complexity of eQTL tissue-specificity in an effort to assess consequences of such variants for complex traits.
Regulation of gene expression is a fundamental cellular process determining a large proportion of the phenotypic variance. Previous studies have identified genetic loci influencing gene expression levels (eQTLs), but the complexity of their tissue-specific properties has not yet been well-characterized. In this study, we perform cis-eQTL analysis in a unique matched co-twin design for three human tissues derived simultaneously from the same set of individuals. The study design allows validation of the substantial discoveries we make in each tissue. We explore in depth the tissue-dependent features of regulatory variants and estimate the proportions of shared and specific effects. We use continuous measures of eQTL sharing to circumvent the statistical power limitations of comparing direct overlap of eQTLs in multiple tissues. In this framework, we demonstrate that 30% of eQTLs are shared among tissues, while 29% are exclusively tissue-specific. Furthermore, we show that the fold change in expression between eQTL genotypic classes differs between tissues. Even among shared eQTLs, we report a substantial proportion (10%–20%) of significant tissue differences in magnitude of these effects. The complexities we highlight here are essential for understanding the impact of regulatory variants on complex traits.
Plasma adiponectin is strongly associated with various components of metabolic syndrome, type 2 diabetes and cardiovascular outcomes. Concentrations are highly heritable and differ between men and women. We therefore aimed to investigate the genetics of plasma adiponectin in men and women.
We combined genome-wide association scans of three population-based studies including 4659 persons. For the replication stage in 13795 subjects, we selected the 20 top signals of the combined analysis, as well as the 10 top signals with p-values less than 1.0×10⁻⁴ for each of the men- and women-specific analyses. We further selected 73 SNPs that were consistently associated with metabolic syndrome parameters in previous genome-wide association studies to check for their association with plasma adiponectin.
The ADIPOQ locus showed genome-wide significant p-values in the combined (p = 4.3×10⁻²⁴) as well as in both women- and men-specific analyses (p = 8.7×10⁻¹⁷ and p = 2.5×10⁻¹¹, respectively). None of the other 39 top signal SNPs showed evidence for association in the replication analysis. None of the 73 SNPs from metabolic syndrome loci exhibited association with plasma adiponectin (p > 0.01).
We identified ADIPOQ as the only major gene for plasma adiponectin, explaining 6.7% of the phenotypic variance. We further found that neither this gene nor any of the metabolic syndrome loci explained the sex differences observed for plasma adiponectin. Larger studies are needed to identify more moderate genetic determinants of plasma adiponectin.
adiponectin; genome-wide association study; polymorphism; cardiovascular disease; metabolic syndrome
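The selection rule for the replication stage described above (20 top signals from the combined analysis, plus up to 10 per sex-specific analysis with p-values below 1.0×10⁻⁴) can be sketched as follows. This is an illustrative Python sketch only: the function names and the dictionary input format are hypothetical, and the handling of SNPs that appear in more than one list is an assumption the abstract does not address.

```python
# P-value ceiling for the sex-specific signals, as quoted in the abstract.
SEX_SPECIFIC_P_MAX = 1.0e-4


def top_signals(p_values, count, p_max=None):
    """Return up to `count` SNP ids with the smallest p-values.

    `p_values` maps SNP id -> p-value. If `p_max` is given, only SNPs
    with p < p_max are eligible.
    """
    eligible = [(p, snp) for snp, p in p_values.items() if p_max is None or p < p_max]
    return [snp for _, snp in sorted(eligible)[:count]]


def select_for_replication(combined, men, women):
    """Combine the three top-signal lists, keeping each SNP only once.

    Assumption: a SNP selected in several analyses is carried forward once.
    """
    candidates = (
        top_signals(combined, 20)
        + top_signals(men, 10, SEX_SPECIFIC_P_MAX)
        + top_signals(women, 10, SEX_SPECIFIC_P_MAX)
    )
    selected = []
    for snp in candidates:
        if snp not in selected:
            selected.append(snp)
    return selected
```

With three non-overlapping lists this yields 20 + 10 + 10 = 40 SNPs, which agrees with the abstract's count of ADIPOQ plus 39 other top signals.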
Synthetic associations have been posited as a possible explanation for missing heritability in complex disease. We show several lines of evidence which suggest that, while possible, these synthetic associations are not common.