Results 1-25 (41)
1.  RNAi–Based Functional Profiling of Loci from Blood Lipid Genome-Wide Association Studies Identifies Genes with Cholesterol-Regulatory Function 
PLoS Genetics  2013;9(2):e1003338.
Genome-wide association studies (GWAS) are powerful tools to unravel genomic loci associated with common traits and complex human disease. However, GWAS only rarely reveal information on the exact genetic elements and pathogenic events underlying an association. In order to extract functional information from genomic data, strategies for systematic follow-up studies on a phenotypic level are required. Here we address these limitations by applying RNA interference (RNAi) to analyze 133 candidate genes within 56 loci identified by GWAS as associated with blood lipid levels, coronary artery disease, and/or myocardial infarction for a function in regulating cholesterol levels in cells. Knockdown of a surprisingly high number (41%) of trait-associated genes affected low-density lipoprotein (LDL) internalization and/or cellular levels of free cholesterol. Our data further show that individual GWAS loci may contain more than one gene with cholesterol-regulatory functions. Using a set of secondary assays we demonstrate for a number of genes without previously known lipid-regulatory roles (e.g. CXCL12, FAM174A, PAFAH1B1, SEZ6L, TBL2, WDR12) that knockdown correlates with altered LDL–receptor levels and/or that overexpression as GFP–tagged fusion proteins inversely modifies cellular cholesterol levels. By providing strong evidence for disease-relevant functions of lipid trait-associated genes, our study demonstrates that quantitative, cell-based RNAi is a scalable strategy for a systematic, unbiased detection of functional effectors within GWAS loci.
Author Summary
Complex traits and diseases are assumed to result from interactions between multiple genes in relevant biological processes. Recent genome-wide association studies have uncovered many novel genomic loci where genes with functional significance are expected. However, functional validation of such genes has thus far remained confined to single-gene approaches. Here, we use RNA interference and high-content screening microscopy to profile 133 genes at 56 loci associated with blood lipid traits, cardiovascular disease, and/or myocardial infarction for a function in regulating cellular free cholesterol levels and the efficiency of low-density lipoprotein uptake. Our results suggest that a high number of trait-associated genes have conserved cholesterol-regulatory functions in cells, with several GWAS loci harboring more than one gene of likely functional significance. For a number of genes without previously known lipid-regulatory functions, consequences upon siRNA knockdown positively correlated with cellular levels of LDL receptor, a major determinant of blood LDL levels. Moreover, GFP–tagged fusion proteins of several candidates shifted cellular cholesterol levels in the opposite direction to knockdown, and subcellular localization of some candidates was sterol-dependent. Our study generates a valuable resource for prioritization of lipid-trait/CAD/MI-associated genes for future in-depth mechanistic analyses and introduces cell-based RNAi as a scalable and unbiased tool for functional follow-up of GWAS loci.
doi:10.1371/journal.pgen.1003338
PMCID: PMC3585126  PMID: 23468663
2.  Prediction and Interaction in Complex Disease Genetics: Experience in Type 1 Diabetes 
PLoS Genetics  2009;5(7):e1000540.
doi:10.1371/journal.pgen.1000540
PMCID: PMC2703795  PMID: 19584936
3.  A Systematic Mapping Approach of 16q12.2/FTO and BMI in More Than 20,000 African Americans Narrows in on the Underlying Functional Variation: Results from the Population Architecture using Genomics and Epidemiology (PAGE) Study 
PLoS Genetics  2013;9(1):e1003171.
Genetic variants in intron 1 of the fat mass– and obesity-associated (FTO) gene have been consistently associated with body mass index (BMI) in Europeans. However, follow-up studies in African Americans (AA) have shown no support for some of the most consistently BMI–associated FTO index single nucleotide polymorphisms (SNPs). This is most likely explained by different race-specific linkage disequilibrium (LD) patterns and lower correlation overall in AA, which provides the opportunity to fine-map this region and narrow in on the functional variant. To comprehensively explore the 16q12.2/FTO locus and to search for second independent signals in the broader region, we fine-mapped a 646–kb region, encompassing the large FTO gene and the flanking gene RPGRIP1L, by investigating a total of 3,756 variants (1,529 genotyped and 2,227 imputed variants) in 20,488 AAs across five studies. We observed associations between BMI and variants in the known FTO intron 1 locus: the SNP with the most significant p-value, rs56137030 (8.3×10^−6), had not been highlighted in previous studies. While rs56137030 was correlated at r^2>0.5 with 103 SNPs in Europeans (including the GWAS index SNPs), this number was reduced to 28 SNPs in AA. Among rs56137030 and the 28 correlated SNPs, six were located within candidate intronic regulatory elements, including rs1421085, for which we predicted allele-specific binding affinity for the transcription factor CUX1, which has recently been implicated in the regulation of FTO. We did not find strong evidence for a second independent signal in the broader region. In summary, this large fine-mapping study in AA has substantially reduced the number of common alleles that are likely to be functional candidates of the known FTO locus. Importantly, our study demonstrated that comprehensive fine-mapping in AA provides a powerful approach to narrow in on the functional candidate(s) underlying the initial GWAS findings in European populations.
Author Summary
Genetic variants within the fat mass– and obesity-associated (FTO) gene are associated with increased risk of obesity. To better understand which specific genetic variant(s) in this genetic region is associated with obesity risk, we attempt to genotype or impute all known genetic variants in the region and test for association with body mass index as a measurement of obesity in over 20,000 African Americans. We identified 29 potential candidate variants, of which one variant (rs1421085) is a particularly interesting candidate for future functional follow-up studies. Our example shows the powerful approach of studying a large African American population, substantially reducing the number of possible functional variants compared with European descent populations.
doi:10.1371/journal.pgen.1003171
PMCID: PMC3547789  PMID: 23341774
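The abstract of record 3 counts SNPs correlated with the lead variant above an r-squared linkage disequilibrium threshold of 0.5. As a rough illustration of that statistic only, the following Python sketch computes r-squared for two biallelic SNPs from haplotype and allele frequencies. The function name `ld_r2` and the example frequencies are hypothetical and are not taken from the study, which does not describe its LD computation here.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared allelic correlation (r^2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical frequencies (assumptions for illustration, not study data).
strong = ld_r2(p_ab=0.38, p_a=0.40, p_b=0.40)  # alleles mostly travel together
weak = ld_r2(p_ab=0.22, p_a=0.40, p_b=0.40)    # alleles nearly independent

print(strong > 0.5, weak > 0.5)
```

With a fixed threshold, a population with weaker correlation between neighbouring variants yields fewer SNPs above the cutoff, which is the property the fine-mapping argument in the abstract relies on.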
4.  Integrative Analysis of a Cross-Loci Regulation Network Identifies App as a Gene Regulating Insulin Secretion from Pancreatic Islets 
PLoS Genetics  2012;8(12):e1003107.
Complex diseases result from molecular changes induced by multiple genetic factors and the environment. To derive a systems view of how genetic loci interact in the context of tissue-specific molecular networks, we constructed an F2 intercross comprised of >500 mice from diabetes-resistant (B6) and diabetes-susceptible (BTBR) mouse strains made genetically obese by the Leptin^ob/ob mutation (Lep^ob). High-density genotypes, diabetes-related clinical traits, and whole-transcriptome expression profiling in five tissues (white adipose, liver, pancreatic islets, hypothalamus, and gastrocnemius muscle) were determined for all mice. We performed an integrative analysis to investigate the inter-relationship among genetic factors, expression traits, and plasma insulin, a hallmark diabetes trait. Among five tissues under study, there are extensive protein–protein interactions between genes responding to different loci in adipose and pancreatic islets that potentially jointly participated in the regulation of plasma insulin. We developed a novel ranking scheme based on cross-loci protein-protein network topology and gene expression to assess each gene's potential to regulate plasma insulin. Unique candidate genes were identified in adipose tissue and islets. In islets, the Alzheimer's gene App was identified as a top candidate regulator. Islets from 17-week-old, but not 10-week-old, App knockout mice showed increased insulin secretion in response to glucose or a membrane-permeant cAMP analog, in agreement with the predictions of the network model. Our result provides a novel hypothesis on the mechanism for the connection between two aging-related diseases: Alzheimer's disease and type 2 diabetes.
Author Summary
Alzheimer's disease and type 2 diabetes are two common aging-related diseases. Numerous studies have shown that the two diseases are associated. However, the mechanisms of such connection are not clear. Both diseases are complex diseases that are induced by multiple genetic factors and the environment. To understand the molecular network regulated by complex genetic factors causing type 2 diabetes, we constructed an F2 intercross comprised of >500 mice from diabetes-resistant and diabetic mouse strains. We measured genotypes, clinical traits, and expression profiling in five tissues for each mouse. We then performed an integrative analysis to investigate the inter-relationship among genetic factors, expression traits, and plasma insulin, a hallmark diabetes trait, and developed a novel method for inferring key regulators for regulating plasma insulin. In islets, the Alzheimer's gene App was identified as a top candidate regulator. Islets from 17-week-old, but not 10-week-old, App knockout mice showed increased insulin secretion in response to glucose, in agreement with the predictions of the network model. Our result provides a novel hypothesis on the mechanism for the connection between two aging-related diseases: Alzheimer's disease and type 2 diabetes.
doi:10.1371/journal.pgen.1003107
PMCID: PMC3516550  PMID: 23236292
5.  Population-Based Resequencing of APOA1 in 10,330 Individuals: Spectrum of Genetic Variation, Phenotype, and Comparison with Extreme Phenotype Approach 
PLoS Genetics  2012;8(11):e1003063.
Rare genetic variants, identified by in-detail resequencing of loci, may contribute to complex traits. We used the apolipoprotein A-I gene (APOA1), a major high-density lipoprotein (HDL) gene, and population-based resequencing to determine the spectrum of genetic variants, the phenotypic characteristics of these variants, and how these results compared with results based on resequencing only the extremes of the apolipoprotein A-I (apoA-I) distribution. First, we resequenced APOA1 in 10,330 population-based participants in the Copenhagen City Heart Study. The spectrum and distribution of genetic variants was determined as a function of the number of individuals resequenced. Second, apoA-I and HDL cholesterol phenotypes were determined for nonsynonymous (NS) and synonymous (S) variants and were validated in the Copenhagen General Population Study (n = 45,239). Third, observed phenotypes were compared with those predicted using an extreme phenotype approach based on the apoA-I distribution. Our results are as follows: First, population-based resequencing of APOA1 identified 40 variants of which only 7 (18%) had minor allele frequencies >1%, and most were exceedingly rare. Second, 0.27% of individuals in the general population were heterozygous for NS variants which were associated with substantial reductions in apoA-I (up to 39 mg/dL) and/or HDL cholesterol (up to 0.9 mmol/L) and, surprisingly, 0.41% were heterozygous for variants predisposing to amyloidosis. NS variants associated with a hazard ratio of 1.72 (1.09–2.70) for myocardial infarction (MI), largely driven by A164S, a variant not associated with apoA-I or HDL cholesterol levels. Third, using the extreme apoA-I phenotype approach, NS variants correctly predicted the apoA-I phenotype observed in the population-based resequencing. However, using the extreme approach, between 79% (screening 0–1st percentile) and 21% (screening 0–20th percentile) of all variants were not identified; among these were variants previously associated with amyloidosis. Population-based resequencing of APOA1 identified a majority of rare NS variants associated with reduced apoA-I and HDL cholesterol levels and/or predisposing to amyloidosis. In addition, NS variants associated with increased risk of MI.
Author Summary
Rare genetic variants, identified by in-detail resequencing of loci, may contribute to complex traits. We used the apolipoprotein A-I gene (APOA1), a major high-density lipoprotein (HDL) gene, and population-based resequencing to determine the spectrum of genetic variants, the phenotypic characteristics of these variants, and how these results compared with results based on resequencing only the extremes of the apolipoprotein A-I (apoA-I) distribution. By resequencing APOA1 in >10,000 Danes and genotyping an additional >45,000, we show that population-based resequencing of APOA1 identifies a majority of rare genetic variants that together are relatively frequent: 0.27% of the population are heterozygous for nonsynonymous (NS) variants in APOA1 that associate with substantial reductions in apoA-I and HDL cholesterol, and 0.41% are heterozygous for variants predisposing to amyloidosis. NS variants associated with a hazard ratio of 1.72 (1.09–2.70) for myocardial infarction (MI), largely driven by A164S, a variant not associated with apoA-I or HDL cholesterol levels. Resequencing only the extremes of the apoA-I distribution, between 79% and 21% of all variants are not identified; among these are variants previously associated with amyloidosis. These results provide direct evidence that rare NS variants in APOA1 contribute to low apoA-I and HDL cholesterol levels, to susceptibility to amyloidosis, and to risk of MI in the general population.
doi:10.1371/journal.pgen.1003063
PMCID: PMC3510059  PMID: 23209431
6.  Mining the Unknown: A Systems Approach to Metabolite Identification Combining Genetic and Metabolic Information 
PLoS Genetics  2012;8(10):e1003005.
Recent genome-wide association studies (GWAS) with metabolomics data linked genetic variation in the human genome to differences in individual metabolite levels. A strong relevance of this metabolic individuality for biomedical and pharmaceutical research has been reported. However, a considerable amount of the molecules currently quantified by modern metabolomics techniques are chemically unidentified. The identification of these “unknown metabolites” is still a demanding and intricate task, limiting their usability as functional markers of metabolic processes. As a consequence, previous GWAS largely ignored unknown metabolites as metabolic traits for the analysis. Here we present a systems-level approach that combines genome-wide association analysis and Gaussian graphical modeling with metabolomics to predict the identity of the unknown metabolites. We apply our method to original data of 517 metabolic traits, of which 225 are unknowns, and genotyping information on 655,658 genetic variants, measured in 1,768 human blood samples. We report previously undescribed genotype–metabotype associations for six distinct gene loci (SLC22A2, COMT, CYP3A5, CYP2C18, GBA3, UGT3A1) and one locus not related to any known gene (rs12413935). Overlaying the inferred genetic associations, metabolic networks, and knowledge-based pathway information, we derive testable hypotheses on the biochemical identities of 106 unknown metabolites. As a proof of principle, we experimentally confirm nine concrete predictions. We demonstrate the benefit of our method for the functional interpretation of previous metabolomics biomarker studies on liver detoxification, hypertension, and insulin resistance. Our approach is generic in nature and can be directly transferred to metabolomics data from different experimental platforms.
Author Summary
Genome-wide association studies on metabolomics data have demonstrated that genetic variation in metabolic enzymes and transporters leads to concentration changes in the respective metabolite levels. The conventional goal of these studies is the detection of novel interactions between the genome and the metabolic system, providing valuable insights for both basic research as well as clinical applications. In this study, we borrow the metabolomics GWAS concept for a novel, entirely different purpose. Metabolite measurements frequently produce signals where a certain substance can be reliably detected in the sample, but it has not yet been elucidated which specific metabolite this signal actually represents. The concept is comparable to a fingerprint: each one is uniquely identifiable, but as long as it is not registered in a database one cannot tell to whom this fingerprint belongs. Obviously, this issue tremendously reduces the usability of a metabolomics analysis. The genetic associations of such an “unknown,” however, give us concrete evidence of the metabolic pathway this substance is most probably involved in. Moreover, we complement the approach with a specific measure of correlation between metabolites, providing further evidence of the metabolic processes of the unknown. For a number of cases, this even allows for a concrete identity prediction, which we then experimentally validate in the lab.
doi:10.1371/journal.pgen.1003005
PMCID: PMC3475673  PMID: 23093944
7.  Extent, Causes, and Consequences of Small RNA Expression Variation in Human Adipose Tissue 
PLoS Genetics  2012;8(5):e1002704.
Small RNAs are functional molecules that modulate mRNA transcripts and have been implicated in the aetiology of several common diseases. However, little is known about the extent of their variability within the human population. Here, we characterise the extent, causes, and effects of naturally occurring variation in expression and sequence of small RNAs from adipose tissue in relation to genotype, gene expression, and metabolic traits in the MuTHER reference cohort. We profiled the expression of 15 to 30 base pair RNA molecules in subcutaneous adipose tissue from 131 individuals using high-throughput sequencing, and quantified levels of 591 microRNAs and small nucleolar RNAs. We identified three genetic variants and three RNA editing events. Highly expressed small RNAs are more conserved within mammals than average, as are those with highly variable expression. We identified 14 genetic loci significantly associated with nearby small RNA expression levels, seven of which also regulate an mRNA transcript level in the same region. In addition, these loci are enriched for variants significant in genome-wide association studies for body mass index. Contrary to expectation, we found no evidence for negative correlation between expression level of a microRNA and its target mRNAs. Trunk fat mass, body mass index, and fasting insulin were associated with more than twenty small RNA expression levels each, while fasting glucose had no significant associations. This study highlights the similar genetic complexity and shared genetic control of small RNA and mRNA transcripts, and gives a quantitative picture of small RNA expression variation in the human population.
Author Summary
Genetic information is transmitted to the cell only through RNA molecules. A special class of RNAs is comprised of the small (up to 30 nucleotide) ones, known to be potent regulators of various cellular processes. At the same time, they have not been as widely studied as messenger RNAs—we do not know how much variation in their sequence and expression level occurs naturally in human populations or how this variability influences other traits. We measured small RNA levels and genetic variability in fat tissue from 131 individuals by high-throughput sequencing. We could associate the expression levels with genetic background of the individuals, as well as changes in metabolic traits. Surprisingly, we found no large scale influence of small RNA variation on mRNA levels, their main regulatory target. Overall, our study is the first to give a quantitative picture of the naturally occurring variation in these important regulatory molecules in human fat tissue.
doi:10.1371/journal.pgen.1002704
PMCID: PMC3349731  PMID: 22589741
8.  Epigenome-Wide Scans Identify Differentially Methylated Regions for Age and Age-Related Phenotypes in a Healthy Ageing Population 
PLoS Genetics  2012;8(4):e1002629.
Age-related changes in DNA methylation have been implicated in cellular senescence and longevity, yet the causes and functional consequences of these variants remain unclear. To elucidate the role of age-related epigenetic changes in healthy ageing and potential longevity, we tested for association between whole-blood DNA methylation patterns in 172 female twins aged 32 to 80 with age and age-related phenotypes. Twin-based DNA methylation levels at 26,690 CpG-sites showed evidence for mean genome-wide heritability of 18%, which was supported by the identification of 1,537 CpG-sites with methylation QTLs in cis at FDR 5%. We performed genome-wide analyses to discover differentially methylated regions (DMRs) for sixteen age-related phenotypes (ap-DMRs) and chronological age (a-DMRs). Epigenome-wide association scans (EWAS) identified age-related phenotype DMRs (ap-DMRs) associated with LDL (STAT5A), lung function (WT1), and maternal longevity (ARL4A, TBX20). In contrast, EWAS for chronological age identified hundreds of predominantly hyper-methylated age DMRs (490 a-DMRs at FDR 5%), of which only one (TBX20) was also associated with an age-related phenotype. Therefore, the majority of age-related changes in DNA methylation are not associated with phenotypic measures of healthy ageing in later life. We replicated a large proportion of a-DMRs in a sample of 44 younger adult MZ twins aged 20 to 61, suggesting that a-DMRs may initiate at an earlier age. We next explored potential genetic and environmental mechanisms underlying a-DMRs and ap-DMRs. Genome-wide overlap across cis-meQTLs, genotype-phenotype associations, and EWAS ap-DMRs identified CpG-sites that had cis-meQTLs with evidence for genotype–phenotype association, where the CpG-site was also an ap-DMR for the same phenotype. Monozygotic twin methylation difference analyses identified one potential environmentally-mediated ap-DMR associated with total cholesterol and LDL (CSMD1). Our results suggest that in a small set of genes DNA methylation may be a candidate mechanism of mediating not only environmental, but also genetic effects on age-related phenotypes.
Author Summary
Epigenetic patterns vary during healthy ageing and development. Age-related DNA methylation changes have been implicated in cellular senescence and longevity, yet the causes and functional consequences of these variants remain unclear. To understand the biological mechanisms involved in potential longevity and rate of healthy ageing, we performed genome-wide association of epigenetic and genetic variation with both chronological age and age-related phenotypes. We identified hundreds of DNA methylation variants significantly associated with age and replicated these in an independent sample of young adult twins. Only a small proportion of these variants were also associated with age-related phenotypes. Therefore, the majority of age-related epigenetic changes do not contribute to rate of healthy ageing at later stages in life. Our results suggest that age-related changes in methylation occur throughout an individual's lifespan and that a proportion of these may be initiated from an early age. Intriguingly, a fraction of the age differentially methylated regions also associated with genetic variants in our sample, suggesting that DNA methylation may be a candidate mechanism of mediating not only environmental but also genetic effects on age-related phenotypes.
doi:10.1371/journal.pgen.1002629
PMCID: PMC3330116  PMID: 22532803
9.  The Human Pancreatic Islet Transcriptome: Expression of Candidate Genes for Type 1 Diabetes and the Impact of Pro-Inflammatory Cytokines 
PLoS Genetics  2012;8(3):e1002552.
Type 1 diabetes (T1D) is an autoimmune disease in which pancreatic beta cells are killed by infiltrating immune cells and by cytokines released by these cells. Signaling events occurring in the pancreatic beta cells are decisive for their survival or death in diabetes. We have used RNA sequencing (RNA–seq) to identify transcripts, including splice variants, expressed in human islets of Langerhans under control conditions or following exposure to the pro-inflammatory cytokines interleukin-1β (IL-1β) and interferon-γ (IFN-γ). Based on this unique dataset, we examined whether putative candidate genes for T1D, previously identified by GWAS, are expressed in human islets. A total of 29,776 transcripts were identified as expressed in human islets. Expression of around 20% of these transcripts was modified by pro-inflammatory cytokines, including apoptosis- and inflammation-related genes. Chemokines were among the transcripts most modified by cytokines, a finding confirmed at the protein level by ELISA. Interestingly, 35% of the genes expressed in human islets undergo alternative splicing as annotated in RefSeq, and cytokines caused substantial changes in spliced transcripts. Nova1, previously considered a brain-specific regulator of mRNA splicing, is expressed in islets and its knockdown modified splicing. 25/41 of the candidate genes for T1D are expressed in islets, and cytokines modified expression of several of these transcripts. The present study doubles the number of known genes expressed in human islets and shows that cytokines modify alternative splicing in human islet cells. Importantly, it indicates that more than half of the known T1D candidate genes are expressed in human islets. This, and the production of a large number of chemokines and cytokines by cytokine-exposed islets, reinforces the concept of a dialog between pancreatic islets and the immune system in T1D. This dialog is modulated by candidate genes for the disease at both the immune system and beta cell level.
Author Summary
Pancreatic beta cells are destroyed by the immune system in type 1 diabetes mellitus, causing insulin dependence for life. Candidate genes for diabetes contribute to this process by acting both at the immune system and, as we suggest here, at the pancreatic beta cell level. We have utilized a novel technology, RNA sequencing, to define all transcripts expressed in human pancreatic islets under basal conditions and following exposure to cytokines, pro-inflammatory mediators that contribute to trigger diabetes. Our observations double the number of known genes present in human islets and indicate that >60% of the candidate genes for type 1 diabetes are expressed in beta cells. The data also show that pro-inflammatory cytokines modify alternative splicing in human islets, a process that may generate novel RNAs and proteins recognizable by the immune system. This, taken together with the findings that pancreatic beta cells themselves express and release many cytokines and chemokines (proteins that attract immune cells), indicates that early type 1 diabetes is characterized by a dialog between beta cells and the immune system. We suggest that candidate genes for diabetes function at least in part as “writers” for the beta cell words in this dialog.
doi:10.1371/journal.pgen.1002552
PMCID: PMC3297576  PMID: 22412385
10.  Coexpression Network Analysis in Abdominal and Gluteal Adipose Tissue Reveals Regulatory Genetic Loci for Metabolic Syndrome and Related Phenotypes 
PLoS Genetics  2012;8(2):e1002505.
Metabolic Syndrome (MetS) is highly prevalent and has considerable public health impact, but its underlying genetic factors remain elusive. To identify gene networks involved in MetS, we conducted whole-genome expression and genotype profiling on abdominal (ABD) and gluteal (GLU) adipose tissue, and whole blood (WB), from 29 MetS cases and 44 controls. Co-expression network analysis for each tissue independently identified nine, six, and zero MetS–associated modules of coexpressed genes in ABD, GLU, and WB, respectively. Of 8,992 probesets expressed in ABD or GLU, 685 (7.6%) were expressed in ABD and 51 (0.6%) in GLU only. Differential eigengene network analysis of 8,256 shared probesets detected 22 shared modules with high preservation across adipose depots (D_ABD-GLU = 0.89), seven of which were associated with MetS (FDR P<0.01). The strongest associated module, significantly enriched for immune response–related processes, contained 94/620 (15%) genes with inter-depot differences. In an independent cohort of 145/141 twins with ABD and WB longitudinal expression data, median variability in ABD due to familiality was greater for MetS–associated versus un-associated modules (ABD: 0.48 versus 0.18, P = 0.08; GLU: 0.54 versus 0.20, P = 7.8×10^−4). Cis-eQTL analysis of probesets associated with MetS (FDR P<0.01) and/or inter-depot differences (FDR P<0.01) provided evidence for 32 eQTLs. Corresponding eSNPs were tested for association with MetS–related phenotypes in two GWAS of >100,000 individuals; rs10282458, affecting expression of RARRES2 (encoding chemerin), was associated with body mass index (BMI) (P = 6.0×10^−4); and rs2395185, affecting inter-depot differences of HLA-DRB1 expression, was associated with high-density lipoprotein (P = 8.7×10^−4) and BMI–adjusted waist-to-hip ratio (P = 2.4×10^−4). Since many genes and their interactions influence complex traits such as MetS, integrated analysis of genotypes and coexpression networks across multiple tissues relevant to clinical traits is an efficient strategy to identify novel associations.
Author Summary
Metabolic Syndrome (MetS) is a highly prevalent disorder with considerable public health concern, but its underlying genetic factors remain elusive. Given that most cellular components exert their functions through interactions with other cellular components, even the largest of genome-wide association (GWA) studies may often not detect their effects, nor necessarily provide insight into the complex molecular mechanisms of the disease. Rather than focusing on individual genes, the analysis of coexpression networks can be used for finding clusters (modules) of correlated expression levels across samples. In this study, we used a gene network–based approach for integrating clinical MetS, genotypic, and gene expression data from abdominal and gluteal adipose tissue and whole blood. We identified modules of genes related to MetS significantly enriched for immune response and oxidative phosphorylation pathways. We tested SNPs for association with MetS–associated expression (eSNPs), and tested prioritised eSNPs for association with MetS–related phenotypes in two large-scale GWA datasets. We identified two loci, neither of which had reached genome-wide significance levels in GWAs, associated with expression levels of RARRES2 and HLA-DRB1 and with MetS–related phenotypes, demonstrating that the integrated analysis of genotype and expression data from relevant multiple tissues can identify novel associations with complex traits such as MetS.
doi:10.1371/journal.pgen.1002505
PMCID: PMC3285582  PMID: 22383892
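The probeset counts quoted in the abstract of record 10 can be cross-checked with simple arithmetic. The short Python sketch below does this, assuming that the 685 probesets "expressed in ABD" are ABD-only, in parallel with the 51 GLU-only probesets; that reading is an assumption, and the variable and function names are illustrative.

```python
# Counts quoted in the abstract of record 10.
total = 8992     # probesets expressed in ABD or GLU
abd_only = 685   # assumed to mean expressed in ABD only
glu_only = 51    # expressed in GLU only

# Probesets expressed in both depots under the assumption above.
shared = total - abd_only - glu_only


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


print(shared)                 # 8256, matching the shared probesets analysed
print(pct(abd_only, total))   # 7.6
print(pct(glu_only, total))   # 0.6
print(pct(94, 620))           # 15.2, reported as 15%
```

Under that reading the figures are internally consistent: the depot-specific counts subtract from 8,992 to give exactly the 8,256 shared probesets used in the differential eigengene network analysis.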
11.  Genome-Wide Association Study Identifies Chromosome 10q24.32 Variants Associated with Arsenic Metabolism and Toxicity Phenotypes in Bangladesh 
PLoS Genetics  2012;8(2):e1002522.
Arsenic contamination of drinking water is a major public health issue in many countries, increasing risk for a wide array of diseases, including cancer. There is inter-individual variation in arsenic metabolism efficiency and susceptibility to arsenic toxicity; however, the basis of this variation is not well understood. Here, we have performed the first genome-wide association study (GWAS) of arsenic-related metabolism and toxicity phenotypes to improve our understanding of the mechanisms by which arsenic affects health. Using data on urinary arsenic metabolite concentrations and approximately 300,000 genome-wide single nucleotide polymorphisms (SNPs) for 1,313 arsenic-exposed Bangladeshi individuals, we identified genome-wide significant association signals (P<5×10^−8) for percentages of both monomethylarsonic acid (MMA) and dimethylarsinic acid (DMA) near the AS3MT gene (arsenite methyltransferase; 10q24.32), with five genetic variants showing independent associations. In a follow-up analysis of 1,085 individuals with arsenic-induced premalignant skin lesions (the classical sign of arsenic toxicity) and 1,794 controls, we show that one of these five variants (rs9527) is also associated with skin lesion risk (P = 0.0005). Using a subset of individuals with prospectively measured arsenic (n = 769), we show that rs9527 interacts with arsenic to influence incident skin lesion risk (P = 0.01). Expression quantitative trait locus (eQTL) analyses of genome-wide expression data from 950 individuals' lymphocyte RNA suggest that several of our lead SNPs represent cis-eQTLs for AS3MT (P = 10^−12) and neighboring gene C10orf32 (P = 10^−44), which are involved in C10orf32-AS3MT read-through transcription. This is the largest and most comprehensive genomic investigation of arsenic metabolism and toxicity to date, the only GWAS of any arsenic-related trait, and the first study to implicate 10q24.32 variants in both arsenic metabolism and arsenical skin lesion risk. The observed patterns of associations suggest that MMA% and DMA% have distinct genetic determinants and support the hypothesis that DMA is the less toxic of these two methylated arsenic species. These results have potential translational implications for the prevention and treatment of arsenic-associated toxicities worldwide.
Author Summary
Exposure to arsenic through drinking water is a serious public health issue in many countries, including Bangladesh and the United States. Although there is substantial inter-individual variation in arsenic metabolism and toxicity, the biological basis of this variation is not well understood. Here, we have conducted the first genome-wide association study of arsenic-related traits within a unique population cohort of arsenic-exposed Bangladeshi individuals. Using data on 1,313 well-characterized individuals, we identify multiple association signals for urinary arsenic metabolite concentrations in the 10q24.32 region, near the AS3MT (arsenite methyltransferase) gene. In a subsequent analysis of >2,000 individuals, we show for the first time that variants that influence arsenic metabolism can also influence risk for arsenical skin lesions (the classical sign of arsenic toxicity) through interaction with arsenic exposure. Using array-based genome-wide gene expression data, we show that several of our lead genetic variants are associated with expression of AS3MT and neighboring gene C10orf32, providing a potential mechanism by which 10q24.32 variants influence arsenic metabolism and toxicity. Knowledge of variation in this region and associated biological processes could be used to develop intervention and pharmacological strategies aimed at preventing large numbers of arsenic-related deaths in arsenic-exposed populations.
doi:10.1371/journal.pgen.1002522
PMCID: PMC3285587  PMID: 22383894
12.  A Genome-Wide Association Study Identified AFF1 as a Susceptibility Locus for Systemic Lupus Eyrthematosus in Japanese 
PLoS Genetics  2012;8(1):e1002455.
Systemic lupus erythematosus (SLE) is an autoimmune disease that causes multiple organ damage. Although recent genome-wide association studies (GWAS) have contributed to discovery of SLE susceptibility genes, few studies have been performed in Asian populations. Here, we report a GWAS for SLE examining 891 SLE cases and 3,384 controls and multi-stage replication studies examining 1,387 SLE cases and 28,564 controls in Japanese subjects. Considering that expression quantitative trait loci (eQTLs) have been implicated in genetic risks for autoimmune diseases, we integrated an eQTL study into the results of the GWAS. We observed enrichments of cis-eQTL positive loci among the known SLE susceptibility loci (30.8%) compared to the genome-wide SNPs (6.9%). In addition, we identified a novel association of a variant in the AF4/FMR2 family, member 1 (AFF1) gene at 4q21 with SLE susceptibility (rs340630; P = 8.3×10−9, odds ratio = 1.21). The risk A allele of rs340630 demonstrated a cis-eQTL effect on the AFF1 transcript with enhanced expression levels (P<0.05). As AFF1 transcripts were prominently expressed in CD4+ and CD19+ peripheral blood lymphocytes, up-regulation of AFF1 may cause the abnormality in these lymphocytes, leading to disease onset.
Author Summary
Although recent genome-wide association study (GWAS) approaches have successfully contributed to disease gene discovery, many susceptibility loci are known to be still uncaptured due to the strict significance threshold for multiple hypothesis testing. Therefore, prioritization of GWAS results by incorporating additional information is recommended. Systemic lupus erythematosus (SLE) is an autoimmune disease that causes multiple organ damage. Considering that abnormalities in B cell activity play essential roles in SLE, prioritization based on an expression quantitative trait loci (eQTLs) study for B cells would be a promising approach. In this study, we report a GWAS and multi-stage replication studies for SLE examining 2,278 SLE cases and 31,948 controls in Japanese subjects. We integrated an eQTL study into the results of the GWAS and identified AFF1 as a novel SLE susceptibility locus. We also confirmed a cis-regulatory effect of the locus on the AFF1 transcript. Our study would be one of the initial successes in detecting a novel genetic locus using an eQTL study, and it should contribute to our understanding of the genetic loci left uncaptured by standard GWAS approaches.
doi:10.1371/journal.pgen.1002455
PMCID: PMC3266877  PMID: 22291604
13.  Association of NCF2, IKZF1, IRF8, IFIH1, and TYK2 with Systemic Lupus Erythematosus 
PLoS Genetics  2011;7(10):e1002341.
Systemic lupus erythematosus (SLE) is a complex trait characterised by the production of a range of auto-antibodies and a diverse set of clinical phenotypes. Currently, ∼8% of the genetic contribution to SLE in Europeans is known, following publication of several moderate-sized genome-wide (GW) association studies, which identified loci with a strong effect (OR>1.3). In order to identify additional genes contributing to SLE susceptibility, we conducted a replication study in a UK dataset (870 cases, 5,551 controls) of 23 variants that showed moderate-risk for lupus in previous studies. Association analysis in the UK dataset and subsequent meta-analysis with the published data identified five SLE susceptibility genes reaching genome-wide levels of significance (Pcomb<5×10−8): NCF2 (Pcomb = 2.87×10−11), IKZF1 (Pcomb = 2.33×10−9), IRF8 (Pcomb = 1.24×10−8), IFIH1 (Pcomb = 1.63×10−8), and TYK2 (Pcomb = 3.88×10−8). Each of the five new loci identified here can be mapped into interferon signalling pathways, which are known to play a key role in the pathogenesis of SLE. These results increase the number of established susceptibility genes for lupus to ∼30 and validate the importance of using large datasets to confirm associations of loci which moderately increase the risk for disease.
Author Summary
Genome-wide association studies have revolutionised our ability to identify common susceptibility alleles for systemic lupus erythematosus (SLE). In complex diseases such as SLE, where many different genes make a modest contribution to disease susceptibility, it is necessary to perform large-scale association studies to combine results from several datasets, to have sufficient power to identify highly significant novel loci (P<5×10−8). Using a large SLE collection of 870 UK SLE cases and 5,551 UK unaffected individuals, we firstly replicated ten moderate-risk alleles (P<0.05) from a US–Swedish study of 3,273 SLE cases and 12,188 healthy controls. Combining our results with the US-Swedish data identified five new loci, which crossed the level for genome-wide significance: NCF2 (neutrophil cytosolic factor 2), IKZF1 (Ikaros family zinc-finger 1), IRF8 (interferon regulatory factor 8), IFIH1 (interferon-induced helicase C domain-containing protein 1), and TYK2 (tyrosine kinase 2). Each of these five genes regulates a different aspect of the immune response and contributes to the production of type-I and type-II interferons. Although further studies will be required to identify the causal alleles within these loci, the confirmation of five new susceptibility genes for lupus makes a significant step forward in our understanding of the genetic contribution to SLE.
doi:10.1371/journal.pgen.1002341
PMCID: PMC3203198  PMID: 22046141
14.  A Genome-Wide Meta-Analysis of Six Type 1 Diabetes Cohorts Identifies Multiple Associated Loci 
PLoS Genetics  2011;7(9):e1002293.
Diabetes impacts approximately 200 million people worldwide, of whom approximately 10% are affected by type 1 diabetes (T1D). The application of genome-wide association studies (GWAS) has robustly revealed dozens of genetic contributors to the pathogenesis of T1D, with the most recent meta-analysis identifying in excess of 40 loci. To identify additional genetic loci for T1D susceptibility, we examined associations in the largest meta-analysis to date between the disease and ∼2.54 million SNPs in a combined cohort of 9,934 cases and 16,956 controls. Targeted follow-up of 53 SNPs in 1,120 affected trios uncovered three new loci associated with T1D that reached genome-wide significance. The most significantly associated SNP (rs539514, P = 5.66×10−11) resides in an intronic region of the LMO7 (LIM domain only 7) gene on 13q22. The second most significantly associated SNP (rs478222, P = 3.50×10−9) resides in an intronic region of the EFR3B (protein EFR3 homolog B) gene on 2p23; however, the region of linkage disequilibrium is approximately 800 kb and harbors additional multiple genes, including NCOA1, C2orf79, CENPO, ADCY3, DNAJC27, POMC, and DNMT3A. The third most significantly associated SNP (rs924043, P = 8.06×10−9) lies in an intergenic region on 6q27, where the region of association is approximately 900 kb and harbors multiple genes including WDR27, C6orf120, PHF10, TCTE3, C6orf208, LOC154449, DLL1, FAM120B, PSMB1, TBP, and PCD2. These latest associated regions add to the growing repertoire of gene networks predisposing to T1D.
Author Summary
Despite the fact that there is clearly a large genetic component to type 1 diabetes (T1D), uncovering the genes contributing to this disease has proven challenging. However, in the past three years there has been relatively major progress in this regard, with advances in genetic screening technologies allowing investigators to scan the genome for variants conferring risk for disease without prior hypotheses. Such genome-wide association studies have revealed multiple regions of the genome to be robustly and consistently associated with T1D. More recent findings have been a consequence of combining multiple datasets from independent investigators in meta-analyses, which have more power to pick up additional variants contributing to the trait. In the current study, we describe the largest meta-analysis of T1D genome-wide genotyped datasets to date, which combines six large studies. As a consequence, we have uncovered three new signals residing at the chromosomal locations 13q22, 2p23, and 6q27, which went on to be replicated in independent sample sets. These latest associated regions add to the growing repertoire of gene networks predisposing to T1D.
doi:10.1371/journal.pgen.1002293
PMCID: PMC3183083  PMID: 21980299
15.  A Genome-Wide Metabolic QTL Analysis in Europeans Implicates Two Loci Shaped by Recent Positive Selection 
PLoS Genetics  2011;7(9):e1002270.
We have performed a metabolite quantitative trait locus (mQTL) study of the 1H nuclear magnetic resonance spectroscopy (1H NMR) metabolome in humans, building on recent targeted knowledge of genetic drivers of metabolic regulation. Urine and plasma samples were collected from two cohorts of individuals of European descent, with one cohort comprised of female twins donating samples longitudinally. Sample metabolite concentrations were quantified by 1H NMR and tested for association with genome-wide single-nucleotide polymorphisms (SNPs). Four metabolites' concentrations exhibited significant, replicable association with SNP variation (8.6×10−11
Author Summary
Physiological concentrations of metabolites—small molecules involved in biochemical processes in living systems—can be measured and used to diagnose and predict disease states. A common goal is to detect and clinically exploit statistical differences in metabolite concentrations between diseased and healthy individuals. As a basis for the design and interpretation of case-control studies, it is useful to have a characterization of metabolic diversity amongst healthy individuals, some of which stems from inter-individual genetic variation. When a single genetic locus has a sufficiently strong effect on metabolism, its genomic position can be determined by collecting metabolite concentration data and genome-wide genotype data on a set of individuals and searching for associations between the two data sets—a so-called metabolite quantitative trait locus (mQTL) study. By so tracing mQTLs, we can identify the genetic drivers of metabolism, characterize how the nature or quantity of the corresponding expressed protein(s) feeds forward to influence metabolite levels, and specify disease-predictive models that incorporate mutual dependence amongst genetics, environment, and metabolism.
doi:10.1371/journal.pgen.1002270
PMCID: PMC3169529  PMID: 21931564
PLoS Genetics  2011;7(8):e1002215.
Metabolomic profiling and the integration of whole-genome genetic association data have proven to be a powerful tool to comprehensively explore gene regulatory networks and to investigate the effects of genetic variation at the molecular level. Serum metabolite concentrations allow a direct readout of biological processes, and association of specific metabolomic signatures with complex diseases such as Alzheimer's disease and cardiovascular and metabolic disorders has been shown. There are well-known correlations between sex and the incidence, prevalence, age of onset, symptoms, and severity of a disease, as well as the reaction to drugs. However, most of the studies published so far did not consider the role of sexual dimorphism and did not analyse their data stratified by gender. This study investigated sex-specific differences of serum metabolite concentrations and their underlying genetic determination. For discovery and replication we used more than 3,300 independent individuals from KORA F3 and F4 with metabolite measurements of 131 metabolites, including amino acids, phosphatidylcholines, sphingomyelins, acylcarnitines, and C6-sugars. A linear regression approach revealed significant concentration differences between males and females for 102 out of 131 metabolites (p-values<3.8×10−4; Bonferroni-corrected threshold). Sex-specific genome-wide association studies (GWAS) showed genome-wide significant differences in beta-estimates for SNPs in the CPS1 locus (carbamoyl-phosphate synthase 1, significance level: p<3.8×10−10; Bonferroni-corrected threshold) for glycine. We showed that the metabolite profiles of males and females are significantly different and, furthermore, that specific genetic variants in metabolism-related genes depict sexual dimorphism. Our study provides new important insights into sex-specific differences of cell regulatory processes and underscores that studies should consider sex-specific effects in design and interpretation.
Author Summary
The combination of genomic and metabolic studies during the last years has provided astonishing results. However, most of the studies published so far did not consider the role of sexual dimorphism and did not analyse their data stratified by sex. The investigation of 131 serum metabolite concentrations of >3,300 population-based samples (KORA F3/F4) revealed significant differences in the metabolite profile of males and females. Furthermore, a genome-wide picture of sex-specific genetic variations in human metabolism (>2,000 subjects from KORA F3/F4 cohorts) was investigated. Sex-specific genome-wide association studies (GWAS) showed differences in the effect of genetic variations on metabolites in men and women. SNPs in the CPS1 (carbamoyl-phosphate synthase 1) locus showed genome-wide significant differences in beta-estimates of sex-specific association analysis (significance level: 3.8×10−10) for glycine. As global metabolomic techniques are more and more refined to identify more compounds in single biological samples, the predictive power of this new technology will greatly increase. This suggests that metabolites, which may be used as predictive biomarkers to indicate the presence or severity of a disease, have to be used selectively depending on sex.
doi:10.1371/journal.pgen.1002215
PMCID: PMC3154959  PMID: 21852955
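The Bonferroni-corrected threshold quoted in the abstract above (3.8×10−4 for 131 metabolites) can be checked with a short calculation. The sketch below assumes a family-wise significance level of 0.05, which the abstract does not state; the function name is illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold for n_tests comparisons."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Assumption: family-wise alpha of 0.05 (not given in the abstract).
# 131 is the number of metabolites tested, as reported in the abstract.
threshold = bonferroni_threshold(alpha=0.05, n_tests=131)
print(f"{threshold:.1e}")  # 3.8e-04
```

Under that assumed alpha, the result agrees with the reported metabolite-level threshold.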
PLoS Genetics  2011;7(7):e1002177.
Genome-wide association studies (GWAS) are now used routinely to identify SNPs associated with complex human phenotypes. In several cases, multiple variants within a gene contribute independently to disease risk. Here we introduce a novel Gene-Wide Significance (GWiS) test that uses greedy Bayesian model selection to identify the independent effects within a gene, which are combined to generate a stronger statistical signal. Permutation tests provide p-values that correct for the number of independent tests genome-wide and within each genetic locus. When applied to a dataset comprising 2.5 million SNPs in up to 8,000 individuals measured for various electrocardiography (ECG) parameters, this method identifies more validated associations than conventional GWAS approaches. The method also provides, for the first time, systematic assessments of the number of independent effects within a gene and the fraction of disease-associated genes housing multiple independent effects, observed at 35%–50% of loci in our study. This method can be generalized to other study designs, retains power for low-frequency alleles, and provides gene-based p-values that are directly compatible for pathway-based meta-analysis.
Author Summary
Genome-wide association studies (GWAS) have successfully identified genetic variants associated with complex human phenotypes. Despite a proliferation of analysis methods, most studies rely on simple, robust SNP–by–SNP univariate tests with ever-larger population sizes. Here we introduce a new test motivated by the biological hypothesis that a single gene may contain multiple variants that contribute independently to a trait. Applied to simulated phenotypes with real genotypes, our new method, Gene-Wide Significance (GWiS), has better power to identify true associations than traditional univariate methods, previous Bayesian methods, popular L1 regularized (LASSO) multivariate regression, and other approaches. GWiS retains power for low-frequency alleles that are increasingly important for personal genetics, and it is the only method tested that accurately estimates the number of independent effects within a gene. When applied to human data for multiple ECG traits, GWiS identifies more genome-wide significant loci (verified by meta-analyses of much larger populations) than any other method. We estimate that 35%–50% of ECG trait loci are likely to have multiple independent effects, suggesting that our method will reveal previously unidentified associations when applied to existing data and will improve power for future association studies.
doi:10.1371/journal.pgen.1002177
PMCID: PMC3145613  PMID: 21829371
PLoS Genetics  2011;7(7):e1002193.
Long-chain n-3 polyunsaturated fatty acids (PUFAs) can derive from diet or from α-linolenic acid (ALA) by elongation and desaturation. We investigated the association of common genetic variation with plasma phospholipid levels of the four major n-3 PUFAs by performing genome-wide association studies in five population-based cohorts comprising 8,866 subjects of European ancestry. Minor alleles of SNPs in FADS1 and FADS2 (desaturases) were associated with higher levels of ALA (p = 3×10−64) and lower levels of eicosapentaenoic acid (EPA, p = 5×10−58) and docosapentaenoic acid (DPA, p = 4×10−154). Minor alleles of SNPs in ELOVL2 (elongase) were associated with higher EPA (p = 2×10−12) and DPA (p = 1×10−43) and lower docosahexaenoic acid (DHA, p = 1×10−15). In addition to genes in the n-3 pathway, we identified a novel association of DPA with several SNPs in GCKR (glucokinase regulator, p = 1×10−8). We observed a weaker association between ALA and EPA among carriers of the minor allele of a representative SNP in FADS2 (rs1535), suggesting a lower rate of ALA-to-EPA conversion in these subjects. In samples of African, Chinese, and Hispanic ancestry, associations of n-3 PUFAs were similar with a representative SNP in FADS1 but less consistent with a representative SNP in ELOVL2. Our findings show that common variation in n-3 metabolic pathway genes and in GCKR influences plasma phospholipid levels of n-3 PUFAs in populations of European ancestry and, for FADS1, in other ancestries.
Author Summary
Circulating long-chain n-3 polyunsaturated fatty acids (PUFAs) derive from fatty fish or from the conversion of the plant n-3 PUFA by elongation and desaturation. We looked for common genetic markers throughout the genome that might influence plasma phospholipid levels of the four major n-3 PUFAs in five large studies and pooled the results. We found that levels of all four n-3 PUFAs were associated with genetic markers in known desaturation and elongation genes. We also found evidence that conversion of the plant n-3 PUFA to longer chain n-3 PUFAs is less effective in people with certain desaturation-gene markers, which could be important for people who do not eat fish. We also found a marker in a gene involved in glucose metabolism, called the glucokinase regulator, to be associated with one intermediate n-3 PUFA. Some of these findings were seen across multiple race/ethnicities. Overall, these results have implications for how genes and the environment interact to influence circulating levels of fatty acids.
doi:10.1371/journal.pgen.1002193
PMCID: PMC3145614  PMID: 21829377
PLoS Genetics  2011;7(7):e1002170.
Asthma is a complex phenotype influenced by genetic and environmental factors. We conducted a genome-wide association study (GWAS) with 938 Japanese pediatric asthma patients and 2,376 controls. Single-nucleotide polymorphisms (SNPs) showing strong associations (P<1×10−8) in GWAS were further genotyped in independent Japanese samples (818 cases and 1,032 controls) and in Korean samples (835 cases and 421 controls). SNP rs987870, located between HLA-DPA1 and HLA-DPB1, was consistently associated with pediatric asthma in 3 independent populations (Pcombined = 2.3×10−10, odds ratio [OR] = 1.40). HLA-DP allele analysis showed that DPA1*0201 and DPB1*0901, which were in strong linkage disequilibrium, were strongly associated with pediatric asthma (DPA1*0201: P = 5.5×10−10, OR = 1.52, and DPB1*0901: P = 2.0×10−7, OR = 1.49). Our findings show that genetic variants in the HLA-DP locus are associated with the risk of pediatric asthma in Asian populations.
Author Summary
Asthma is the most common chronic disorder in children, and asthma exacerbation is an important cause of childhood morbidity and hospitalization. Here, taking advantage of recent technological advances in human genetics, we performed a genome-wide association study and follow-up validation studies to identify genetic variants for asthma. By examining 6,428 Asians, we found rs987870 and HLA-DPA1*0201/DPB1*0901 were associated with pediatric asthma. The association signal was stretched in the region of HLA-DPB2, collagen, type XI, alpha 2 (COL11A2), and Retinoid X receptor beta (RXRB), but strong linkage disequilibrium in this region made it difficult to specifically identify causative variants. Interestingly, the SNP (or the HLA-DP allele) associated with pediatric asthma (Th-2 type immune diseases) in the present study confers protection against Th-1 type immune diseases, such as type 1 diabetes and rheumatoid arthritis. Therefore, the association results obtained in the present study could partially explain the inverse relationship between asthma and Th-1 type immune diseases and may lead to better understanding of Th-1/Th-2 immune diseases.
doi:10.1371/journal.pgen.1002170
PMCID: PMC3140987  PMID: 21814517
PLoS Genetics  2011;7(7):e1002091.
Systemic sclerosis (SSc) is an orphan, complex, inflammatory disease affecting the immune system and connective tissue. SSc stands out as a severely incapacitating and life-threatening inflammatory rheumatic disease, with a largely unknown pathogenesis. We have designed a two-stage genome-wide association study of SSc using case-control samples from France, Italy, Germany, and Northern Europe. The initial genome-wide scan was conducted in a French post quality-control sample of 564 cases and 1,776 controls, using almost 500 K SNPs. Two SNPs from the MHC region, together with the 6 loci outside MHC having at least one SNP with a P<10−5 were selected for follow-up analysis. These markers were genotyped in a post-QC replication sample of 1,682 SSc cases and 3,926 controls. The three top SNPs are in strong linkage disequilibrium and located on 6p21, in the HLA-DQB1 gene: rs9275224, P = 9.18×10−8, OR = 0.69, 95% CI [0.60–0.79]; rs6457617, P = 1.14×10−7 and rs9275245, P = 1.39×10−7. Within the MHC region, the next most associated SNP (rs3130573, P = 1.86×10−5, OR = 1.36 [1.18–1.56]) is located in the PSORS1C1 gene. Outside the MHC region, our GWAS analysis revealed 7 top SNPs (P<10−5) that spanned 6 independent genomic regions. Follow-up of the 17 top SNPs in an independent sample of 1,682 SSc and 3,926 controls showed associations at PSORS1C1 (overall P = 5.70×10−10, OR:1.25), TNIP1 (P = 4.68×10−9, OR:1.31), and RHOB loci (P = 3.17×10−6, OR:1.21). Because of its biological relevance, and previous reports of genetic association at this locus with connective tissue disorders, we investigated TNIP1 expression. A markedly reduced expression of the TNIP1 gene and also its protein product were observed both in lesional skin tissue and in cultured dermal fibroblasts from SSc patients. Furthermore, TNIP1 showed in vitro inhibitory effects on inflammatory cytokine-induced collagen production. 
The genetic signal of association with TNIP1 variants, together with tissular and cellular investigations, suggests that this pathway has a critical role in regulating autoimmunity and SSc pathogenesis.
Author Summary
Systemic sclerosis (SSc) is a connective tissue disease characterized by generalized microangiopathy, severe immunologic alterations, and massive deposits of matrix components in the connective tissue. Epidemiological investigations indicate that SSc follows a pattern of multifactorial inheritance; however, only a few loci have been replicated in multiple studies. We undertook a two-stage genome-wide association study of SSc involving over 8,800 individuals of European ancestry. Combined analyses showed independent association at the known HLA-DQB1 region and revealed associations at PSORS1C1, TNIP1, and RHOB loci, in agreement with a strong immune genetic component. Because of its biological relevance, and previous reports of genetic association at this locus with other connective tissue disorders, we investigated TNIP1 expression. We observed a markedly reduced expression of the gene and its protein product in SSc, as well as its potential implication in control of extra-cellular matrix synthesis, providing a new clue for a link between inflammation/immunity and fibrosis.
doi:10.1371/journal.pgen.1002091
PMCID: PMC3131285  PMID: 21750679
PLoS Genetics  2011;7(6):e1002150.
Single nucleotide polymorphisms (SNPs) in MYH9 and APOL1 on chromosome 22 (c22) are powerfully associated with non-diabetic end-stage renal disease (ESRD) in African Americans (AAs). Many AAs diagnosed with type 2 diabetic nephropathy (T2DN) have non-diabetic kidney disease, potentially masking detection of DN genes. Therefore, genome-wide association analyses were performed using the Affymetrix SNP Array 6.0 in 966 AAs with T2DN and 1,032 non-diabetic, non-nephropathy (NDNN) controls, with and without adjustment for c22 nephropathy risk variants. No associations were seen between FRMD3 SNPs and T2DN before adjusting for c22 variants. However, logistic regression analysis revealed seven FRMD3 SNPs significantly interacting with MYH9—a finding replicated in 640 additional AA T2DN cases and 683 NDNN controls. Contrasting all 1,592 T2DN cases with all 1,671 NDNN controls, FRMD3 SNPs appeared to interact with the MYH9 E1 haplotype (e.g., rs942280 interaction p-value = 9.3E−7 additive; odds ratio [OR] 0.67). FRMD3 alleles were associated with increased risk of T2DN only in subjects lacking two MYH9 E1 risk haplotypes (rs942280 OR = 1.28), not in MYH9 E1 risk allele homozygotes (rs942280 OR = 0.80; homogeneity p-value = 4.3E−4). Effects were weaker stratifying on APOL1. FRMD3 SNPs were associated with T2DN, not type 2 diabetes per se, comparing AAs with T2DN to those with diabetes lacking nephropathy. T2DN-associated FRMD3 SNPs were detectable in AAs only after accounting for MYH9, with differential effects for APOL1. These analyses reveal a role for FRMD3 in AA T2DN susceptibility and accounting for c22 nephropathy risk variants can assist in detecting DN susceptibility genes.
Author Summary
African Americans have high rates of kidney disease attributed to type 2 diabetes mellitus. However, approximately 25% of patients are misclassified and have non-diabetic kidney disease on renal biopsy. The APOL1-MYH9 gene region on chromosome 22 is powerfully associated with non-diabetic kidney diseases in African Americans. Therefore, we tested for interactions between single nucleotide polymorphisms across the genome with APOL1 and MYH9 non-diabetic nephropathy risk variants in African Americans with presumed diabetic nephropathy. Markers in FRMD3, a gene associated with type 1 diabetic nephropathy in Caucasians, appeared to interact with MYH9; however, increased nephropathy risk was seen in diabetic cases lacking two MYH9 risk haplotypes, and protective effects were seen in those with two MYH9 risk haplotypes. Stratified analyses based on the chromosome 22 nephropathy risk haplotypes demonstrated that FRMD3 variants were associated with diabetic nephropathy risk in cases without two MYH9 (or APOL1) risk haplotypes. It appears that African Americans with diabetes and kidney disease who are not chromosome 22 nephropathy risk variant homozygotes are enriched for the presence of diabetic nephropathy and FRMD3 risk alleles. This genetic dissection ultimately allowed for detection of the FRMD3 diabetic nephropathy gene association in a subset of cases enriched for this disorder.
doi:10.1371/journal.pgen.1002150
PMCID: PMC3116917  PMID: 21698141
PLoS Genetics  2011;7(3):e1001324.
Nonalcoholic fatty liver disease (NAFLD) clusters in families, but the only known common genetic variants influencing risk are near PNPLA3. We sought to identify additional genetic variants influencing NAFLD using genome-wide association (GWA) analysis of computed tomography (CT) measured hepatic steatosis, a non-invasive measure of NAFLD, in large population based samples. Using variance components methods, we show that CT hepatic steatosis is heritable (∼26%–27%) in family-based Amish, Family Heart, and Framingham Heart Studies (n = 880 to 3,070). By carrying out a fixed-effects meta-analysis of genome-wide association (GWA) results between CT hepatic steatosis and ∼2.4 million imputed or genotyped SNPs in 7,176 individuals from the Old Order Amish, Age, Gene/Environment Susceptibility-Reykjavik study (AGES), Family Heart, and Framingham Heart Studies, we identify variants associated at genome-wide significant levels (p<5×10−8) in or near PNPLA3, NCAN, and PPP1R3B. We genotype these and 42 other top CT hepatic steatosis-associated SNPs in 592 subjects with biopsy-proven NAFLD from the NASH Clinical Research Network (NASH CRN). In comparisons with 1,405 healthy controls from the Myocardial Infarction Genetics Consortium (MIGen), we observe significant associations with histologic NAFLD at variants in or near NCAN, GCKR, LYPLAL1, and PNPLA3, but not PPP1R3B. Variants at these five loci exhibit distinct patterns of association with serum lipids, as well as glycemic and anthropometric traits. We identify common genetic variants influencing CT–assessed steatosis and risk of NAFLD. Hepatic steatosis associated variants are not uniformly associated with NASH/fibrosis or result in abnormalities in serum lipids or glycemic and anthropometric traits, suggesting genetic heterogeneity in the pathways influencing these traits.
Author Summary
NAFLD is a spectrum of disease that ranges from steatosis to steatohepatitis (nonalcoholic steatohepatitis or NASH: inflammation around the fat) to fibrosis/cirrhosis. Hepatic steatosis can be measured non-invasively using computed tomography (CT) whereas NASH/fibrosis is assessed histologically. The genetic underpinnings of NAFLD remain to be determined. Here we estimate that 26%–27% of the variation in CT measured hepatic steatosis is heritable or genetic. We identify three variants near PNPLA3, NCAN, and PPP1R3B that associate with CT hepatic steatosis and show that variants in or near NCAN, GCKR, LYPLAL1, and PNPLA3, but not PPP1R3B, associate with histologic lobular inflammation/fibrosis. Variants in or near NCAN, GCKR, and PPP1R3B associate with altered serum lipid levels, whereas those in or near LYPLAL1 and PNPLA3 do not. Variants near GCKR and PPP1R3B also affect glycemic traits. Thus, we show that NAFLD is genetically influenced and expand the number of common genetic variants that associate with this trait. Our findings suggest that development of hepatic steatosis, NASH/fibrosis, or abnormalities in metabolic traits are probably influenced by different metabolic pathways that may represent distinct therapeutic targets.
doi:10.1371/journal.pgen.1001324
PMCID: PMC3053321  PMID: 21423719
PLoS Genetics  2011;7(2):e1001311.
Systemic lupus erythematosus (SLE) is a genetically complex disease with heterogeneous clinical manifestations. Recent studies have greatly expanded the number of established SLE risk alleles, but the distribution of multiple risk alleles in cases versus controls and their relationship to subphenotypes have not been studied. We studied 22 SLE susceptibility polymorphisms with previous genome-wide evidence of association (p<5×10⁻⁸) in 1919 SLE cases from 9 independent Caucasian SLE case series and 4813 independent controls. The mean number of risk alleles in cases was 15.1 (SD 3.1) while the mean in controls was 13.1 (SD 2.8), with trend p = 4×10⁻¹²⁸. We defined a genetic risk score (GRS) for SLE as the number of risk alleles with each weighted by the SLE risk odds ratio (OR). The OR for high-low GRS tertiles, adjusted for intra-European ancestry, sex, and parent study, was 4.4 (95% CI 3.8–5.1). We studied associations of individual SNPs and the GRS with clinical manifestations for the cases: age at diagnosis, the 11 American College of Rheumatology classification criteria, and double-stranded DNA antibody (anti-dsDNA) production. Six subphenotypes were significantly associated with the GRS, most notably anti-dsDNA (ORhigh-low = 2.36, p = 9e−9), the immunologic criterion (ORhigh-low = 2.23, p = 3e−7), and age at diagnosis (ORhigh-low = 1.45, p = 0.0060). Finally, we developed a subphenotype-specific GRS (sub-GRS) for each phenotype with more power to detect cumulative genetic associations. The sub-GRS was more strongly associated than any single SNP effect for 5 subphenotypes (the above plus hematologic disorder and oral ulcers), while single loci are more significantly associated with renal disease (HLA-DRB1, OR = 1.37, 95% CI 1.14–1.64) and arthritis (ITGAM, OR = 0.72, 95% CI 0.59–0.88). We did not observe significant associations for other subphenotypes, for individual loci or the sub-GRS. Thus our analysis categorizes SLE subphenotypes into three groups: those having cumulative, single, and no known genetic association with respect to the currently established SLE risk loci.
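The weighted genetic risk score described above can be sketched as follows. The abstract says only that each risk allele is weighted by its odds ratio; using the natural log of the OR as the weight is an assumption made here because it is a common convention, and the function name and example values are hypothetical.

```python
import math


def weighted_grs(risk_allele_counts, odds_ratios, use_log_or=True):
    """Sum risk-allele counts (0, 1, or 2 per SNP), each weighted by its OR.

    Assumption: weights are ln(OR) by default; pass use_log_or=False to
    weight by the raw odds ratio instead.
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    score = 0.0
    for count, odds_ratio in zip(risk_allele_counts, odds_ratios):
        if count not in (0, 1, 2):
            raise ValueError("risk allele counts must be 0, 1, or 2")
        weight = math.log(odds_ratio) if use_log_or else odds_ratio
        score += count * weight
    return score


# Hypothetical individual genotyped at three risk SNPs.
example_score = weighted_grs([2, 1, 0], [1.5, 2.0, 1.2])
```

With log weights, a SNP with OR = 1 contributes nothing to the score, and an unweighted allele count is recovered by setting every weight to 1.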
Author Summary
Systemic lupus erythematosus is a chronic disabling autoimmune disease, most commonly striking women in their thirties or forties. It can cause a wide variety of clinical manifestations, including kidney disease, arthritis, and skin disorders. Prognosis varies greatly depending on these clinical features, with kidney disease and related characteristics leading to greater morbidity and mortality. It is also complex genetically; while lupus runs in families, genes increase one's risk for lupus but do not fully determine the outcome. The interactions of multiple genes and/or interactions between genes and environmental factors may cause lupus, but the causes and disease pathways of this very heterogeneous disease are not well understood. By examining relationships between the presence of multiple lupus risk genes, lupus susceptibility, and clinical manifestations, we hope to better understand how lupus is triggered and by what biological pathways it progresses. We show in this work that certain clinical manifestations of lupus are highly associated with cumulative genetic variations, i.e. multiple risk alleles, while others are associated with a single variation or none at all.
doi:10.1371/journal.pgen.1001311
PMCID: PMC3040652  PMID: 21379322
PLoS Genetics  2011;7(2):e1001307.
An age-dependent association between variation at the FTO locus and BMI in children has been suggested. We meta-analyzed associations between the FTO locus (rs9939609) and BMI in samples, aged from early infancy to 13 years, from 8 cohorts of European ancestry. We found a positive association between additional minor (A) alleles and BMI from 5.5 years onwards, but an inverse association below age 2.5 years. Modelling median BMI curves for each genotype using the LMS method, we found that carriers of minor alleles showed lower BMI in infancy, earlier adiposity rebound (AR), and higher BMI later in childhood. Differences by allele were consistent with two independent processes: earlier AR equivalent to accelerating developmental age by 2.37% (95% CI 1.87, 2.87, p = 10⁻²⁰) per A allele and a positive age by genotype interaction such that BMI increased faster with age (p = 10⁻²³). We also fitted a linear mixed effects model to relate genotype to the BMI curve inflection points adiposity peak (AP) in infancy and AR. Carriage of two minor alleles at rs9939609 was associated with lower BMI at AP (−0.40% (95% CI: −0.74, −0.06), p = 0.02), higher BMI at AR (0.93% (95% CI: 0.22, 1.64), p = 0.01), and earlier AR (−4.72% (−5.81, −3.63), p = 10⁻¹⁷), supporting cross-sectional results. Overall, we confirm the expected association between variation at rs9939609 and BMI in childhood, but only after an inverse association between the same variant and BMI in infancy. Patterns are consistent with a shift on the developmental scale, which is reflected in association with the timing of AR rather than just a global increase in BMI. Results provide important information about longitudinal gene effects and about the role of FTO in adiposity. The associated shifts in developmental timing have clinical importance with respect to known relationships between AR and both later-life BMI and metabolic disease risk.
Author Summary
Variation at the FTO locus is reliably associated with BMI and adiposity-related traits, but little is still known about the effects of variation at this gene, particularly in children. We have examined a large collection of samples for which both genotypes at rs9939609 and multiple measurements of BMI are available. We observe a positive association between the minor allele (A) at rs9939609 and BMI emerging in childhood that has the characteristics of a shift in the age scale leading simultaneously to lower BMI during infancy and higher BMI in childhood. Assessed in cross section and longitudinally, we find evidence of variation at rs9939609 being associated with the timing of AR and the concert of events expected with such a change to the BMI curve. Importantly, the apparently negative association between the minor allele (A) and BMI in early life, which is then followed by an earlier AR and greater BMI in childhood, is a pattern known to be associated with both the risk of adult BMI and metabolic disorders such as type 2 diabetes (T2D). These findings are important in our understanding of the contribution of FTO to adiposity, but also in light of efforts to appreciate genetic effects in a lifecourse context.
doi:10.1371/journal.pgen.1001307
PMCID: PMC3040655  PMID: 21379325
PLoS Genetics  2011;7(2):e1002003.
While there have been studies exploring regulatory variation in one or more tissues, the complexity of tissue-specificity in multiple primary tissues is not yet well understood. We explore in depth the role of cis-regulatory variation in three human tissues: lymphoblastoid cell lines (LCL), skin, and fat. The samples (156 LCL, 160 skin, 166 fat) were derived simultaneously from a subset of well-phenotyped healthy female twins of the MuTHER resource. We discover an abundance of cis-eQTLs in each tissue similar to previous estimates (858 or 4.7% of genes). In addition, we apply factor analysis (FA) to remove effects of latent variables, thus more than doubling the number of our discoveries (1,822 eQTL genes). The unique study design (Matched Co-Twin Analysis—MCTA) permits immediate replication of eQTLs using co-twins (93%–98%) and validation of the considerable gain in eQTL discovery after FA correction. We highlight the challenges of comparing eQTLs between tissues. After verifying previous significance threshold-based estimates of tissue-specificity, we show their limitations given their dependency on statistical power. We propose that continuous estimates of the proportion of tissue-shared signals and direct comparison of the magnitude of effect on the fold change in expression are essential properties that jointly provide a biologically realistic view of tissue-specificity. Under this framework we demonstrate that 30% of eQTLs are shared among the three tissues studied, while another 29% appear exclusively tissue-specific. However, even among the shared eQTLs, a substantial proportion (10%–20%) have significant differences in the magnitude of fold change between genotypic classes across tissues. Our results underline the need to account for the complexity of eQTL tissue-specificity in an effort to assess consequences of such variants for complex traits.
Author Summary
Regulation of gene expression is a fundamental cellular process determining a large proportion of the phenotypic variance. Previous studies have identified genetic loci influencing gene expression levels (eQTLs), but the complexity of their tissue-specific properties has not yet been well-characterized. In this study, we perform cis-eQTL analysis in a unique matched co-twin design for three human tissues derived simultaneously from the same set of individuals. The study design allows validation of the substantial discoveries we make in each tissue. We explore in depth the tissue-dependent features of regulatory variants and estimate the proportions of shared and specific effects. We use continuous measures of eQTL sharing to circumvent the statistical power limitations of comparing direct overlap of eQTLs in multiple tissues. In this framework, we demonstrate that 30% of eQTLs are shared among tissues, while 29% are exclusively tissue-specific. Furthermore, we show that the fold change in expression between eQTL genotypic classes differs between tissues. Even among shared eQTLs, we report a substantial proportion (10%–20%) of significant tissue differences in magnitude of these effects. The complexities we highlight here are essential for understanding the impact of regulatory variants on complex traits.
doi:10.1371/journal.pgen.1002003
PMCID: PMC3033383  PMID: 21304890
