Related Articles

1.  Lung eQTLs to Help Reveal the Molecular Underpinnings of Asthma 
PLoS Genetics  2012;8(11):e1003029.
Genome-wide association studies (GWAS) have identified loci reproducibly associated with pulmonary diseases; however, the molecular mechanisms underlying these associations are largely unknown. The objectives of this study were to discover genetic variants affecting gene expression in human lung tissue, to refine susceptibility loci for asthma identified in GWAS, and to use the genetics of gene expression and network analyses to find key molecular drivers of asthma. We performed a genome-wide search for expression quantitative trait loci (eQTL) in 1,111 human lung samples. The lung eQTL dataset was then used to inform asthma genetic studies reported in the literature. The top ranked lung eQTLs were integrated with the GWAS on asthma reported by the GABRIEL consortium to generate a Bayesian gene expression network for discovery of novel molecular pathways underpinning asthma. We detected 17,178 cis- and 593 trans- lung eQTLs, which can be used to explore the functional consequences of loci associated with lung diseases and traits. Some strong eQTLs are also asthma susceptibility loci. For example, rs3859192 on chr17q21 is robustly associated with the mRNA levels of GSDMA (P = 3.55×10^−151). The genetic-gene expression network identified the SOCS3 pathway as one of the key drivers of asthma. The eQTLs and gene networks identified in this study are powerful tools for elucidating the causal mechanisms underlying pulmonary disease. This data resource offers much-needed support to pinpoint the causal genes and characterize the molecular function of gene variants associated with lung diseases.
Author Summary
Recent genome-wide association studies (GWAS) have identified genetic variants associated with lung diseases. The challenge now is to find the causal genes in GWAS–nominated chromosomal regions and to characterize the molecular function of disease-associated genetic variants. In this paper, we describe an international effort to systematically capture the genetic architecture of gene expression regulation in human lung. By studying lung specimens from 1,111 individuals of European ancestry, we found a large number of genetic variants affecting gene expression in the lung, or lung expression quantitative trait loci (eQTL). These lung eQTLs will serve as an important resource to aid in the understanding of the molecular underpinnings of lung biology and its disruption in disease. To demonstrate the utility of this lung eQTL dataset, we integrated our data with previous genetic studies on asthma. Through integrative techniques, we identified causal variants and genes in GWAS–nominated loci and found key molecular drivers for asthma. We feel that sharing our lung eQTL dataset with the scientific community will leverage the impact of previous large-scale GWAS on lung diseases and function by providing much-needed functional information to understand the molecular changes introduced by the susceptibility genetic variants.
doi:10.1371/journal.pgen.1003029
PMCID: PMC3510026  PMID: 23209423
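The basic test behind a single eQTL association, such as the SNP-to-mRNA association reported in the abstract above, can be illustrated with a simple linear regression of expression on allele dosage. This is a minimal sketch under stated assumptions, not the study's pipeline: the function name `eqtl_regression` and the toy data are hypothetical, genotypes are assumed to be coded as 0, 1, or 2 copies of an allele, and no covariates or multiple-testing correction are included.

```python
import math


def eqtl_regression(dosages, expression):
    """Regress expression on allele dosage; return (slope, t statistic, R^2).

    Hypothetical helper for illustration only: one SNP, one transcript,
    ordinary least squares with an intercept and no covariates.
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    syy = sum((e - mean_e) ** 2 for e in expression)
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    # Residual sum of squares around the fitted line.
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(dosages, expression))
    r2 = 1.0 - rss / syy  # fraction of expression variance explained
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    t = slope / se if se > 0 else math.inf
    return slope, t, r2


# Toy data (made up): expression rises by about one unit per allele copy.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, t_stat, r2 = eqtl_regression(dosages, expression)
```

In a genome-wide scan this test would be repeated for every SNP-transcript pair, which is why the significance thresholds in such studies are so stringent.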
2.  Integrated genomic approaches to identification of candidate genes underlying metabolic and cardiovascular phenotypes in the spontaneously hypertensive rat 
Physiological Genomics  2011;43(21):1207-1218.
The spontaneously hypertensive rat (SHR) is a widely used rodent model of hypertension and metabolic syndrome. Previously we identified thousands of cis-regulated expression quantitative trait loci (eQTLs) across multiple tissues using a panel of rat recombinant inbred (RI) strains derived from Brown Norway and SHR progenitors. These cis-eQTLs represent potential susceptibility loci underlying physiological and pathophysiological traits manifested in SHR. We have prioritized 60 cis-eQTLs and confirmed differential expression between the parental strains by quantitative PCR in 43 (72%) of the eQTL transcripts. Quantitative trait transcript (QTT) analysis in the RI strains showed highly significant correlation between cis-eQTL transcript abundance and clinically relevant traits such as systolic blood pressure and blood glucose, with the physical location of a subset of the cis-eQTLs colocalizing with “physiological” QTLs (pQTLs) for these same traits. These colocalizing correlated cis-eQTLs (c3-eQTLs) are highly attractive as primary susceptibility loci for the colocalizing pQTLs. Furthermore, sequence analysis of the c3-eQTL genes identified single nucleotide polymorphisms (SNPs) that are predicted to affect transcription factor binding affinity, splicing and protein function. These SNPs, which potentially alter transcript abundance and stability, represent strong candidate factors underlying not just eQTL expression phenotypes, but also the correlated metabolic and physiological traits. In conclusion, by integration of genomic sequence, eQTL and QTT datasets we have identified several genes that are strong positional candidates for pathophysiological traits observed in the SHR strain. These findings provide a basis for the functional testing and ultimate elucidation of the molecular basis of these metabolic and cardiovascular phenotypes.
doi:10.1152/physiolgenomics.00210.2010
PMCID: PMC3217321  PMID: 21846806
expression quantitative trait locus; spontaneously hypertensive rat; quantitative trait transcript; sequence variation
3.  Liver expression quantitative trait loci: a foundation for pharmacogenomic research 
Frontiers in Genetics  2012;3:153.
Expression quantitative trait loci (eQTL) analysis can provide insights into the genetic regulation of gene expression at a genomic level and this information is proving extremely useful in many different areas of research. As a consequence of the role of the liver in drug metabolism and disposition, the study of eQTLs in primary human liver tissue could provide a foundation for pharmacogenomics. Thus far, four genome-wide eQTL studies have been performed using human livers. Many liver eQTLs have been found to be reproducible and a proportion of these may be specific to the liver. Already these data have been used to interpret and inform clinical genome-wide association studies, providing potential mechanistic evidence for clinical associations and identifying genes which may impact clinical phenotypes. However, the utility of liver eQTL data has not yet been fully explored or realized in pharmacogenomics. As further liver eQTL research is undertaken, the genetic regulation of gene expression will become much better characterized and this knowledge will create a rational basis for the prospective pharmacogenomic study of many drugs.
doi:10.3389/fgene.2012.00153
PMCID: PMC3418580  PMID: 22912647
liver; expression quantitative trait loci; ADME genes; GWAS; clinical pharmacogenomics
4.  Identification of an imprinted master trans-regulator at the KLF14 locus related to multiple metabolic phenotypes 
Nature genetics  2011;43(6):561-564.
Genome-wide association studies have identified many genetic variants associated with complex traits. However, at only a minority of loci have the molecular mechanisms mediating these associations been characterized. In parallel, whilst cis-regulatory patterns of gene expression have been extensively explored, the identification of trans-regulatory effects in humans has attracted less attention. We demonstrate that the Type 2 diabetes and HDL-cholesterol associated cis-acting eQTL of the maternally-expressed transcription factor KLF14 acts as a master trans-regulator of adipose gene expression. Expression levels of genes regulated by this trans-eQTL are highly-correlated with concurrently-measured metabolic traits, and a subset of the trans-genes harbor variants directly-associated with metabolic phenotypes. This trans-eQTL network provides a mechanistic understanding of the effect of the KLF14 locus on metabolic disease risk, providing a potential model for other complex traits.
doi:10.1038/ng.833
PMCID: PMC3192952  PMID: 21572415
5.  A statistical approach to finding overlooked genetic associations 
BMC Bioinformatics  2010;11:526.
Background
Complexity and noise in expression quantitative trait loci (eQTL) studies make it difficult to distinguish potential regulatory relationships among the many interactions. The predominant method of identifying eQTLs finds associations that are significant at a genome-wide level. The vast number of statistical tests carried out on these data makes false negatives very likely. Corrections for multiple testing error render genome-wide eQTL techniques unable to detect modest regulatory effects.
We propose an alternative method to identify eQTLs that builds on traditional approaches. In contrast to genome-wide techniques, our method determines the significance of an association between an expression trait and a locus with respect to the set of all associations to the expression trait. The use of this specific information facilitates identification of expression traits that have an expression profile that is characterized by a single exceptional association to a locus.
Our approach identifies expression traits that have exceptional associations regardless of the genome-wide significance of those associations. This property facilitates the identification of possible false negatives for genome-wide significance. Further, our approach has the property of prioritizing expression traits that are affected by few strong associations. Expression traits identified by this method may warrant additional study because their expression level may be affected by targeting genes near a single locus.
Results
We demonstrate our method by identifying eQTL hotspots in Plasmodium falciparum (malaria) and Saccharomyces cerevisiae (yeast). We demonstrate the prioritization of traits with few strong genetic effects through Gene Ontology (GO) analysis of yeast. Our results are strongly consistent with results gathered using genome-wide methods and identify additional hotspots and eQTLs.
Conclusions
New eQTLs and hotspots found with this method may represent regions of the genome or biological processes that are controlled through few relatively strong genetic interactions. These points of interest warrant experimental investigation.
doi:10.1186/1471-2105-11-526
PMCID: PMC2974753  PMID: 20964847
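The idea in the abstract above, judging an association relative to all other associations of the same expression trait rather than against a genome-wide threshold, can be sketched as follows. The abstract does not give the actual statistic, so this example substitutes a plain z-score of the top locus against the trait's remaining loci; that choice, the function names `exceptional_locus` and `rank_traits`, and the toy scores are all assumptions made for illustration.

```python
import statistics


def exceptional_locus(scores):
    """Return (index of top locus, z-score of top locus vs. the other loci).

    `scores` holds one trait's association strength at every locus
    (for example -log10 p). The z-score is an illustrative stand-in,
    not the statistic used in the paper.
    """
    top = max(range(len(scores)), key=lambda i: scores[i])
    rest = [s for i, s in enumerate(scores) if i != top]
    mu = statistics.mean(rest)
    sd = statistics.stdev(rest)
    z = (scores[top] - mu) / sd if sd > 0 else float("inf")
    return top, z


def rank_traits(score_table):
    """Rank traits by how far their single best locus stands out."""
    results = {trait: exceptional_locus(s) for trait, s in score_table.items()}
    return sorted(results.items(), key=lambda kv: kv[1][1], reverse=True)


# Toy data (made up): geneB has uniformly higher scores, but only geneA
# has one locus that stands far apart from its own background.
score_table = {
    "geneA": [1.0, 1.2, 0.8, 1.1, 4.0],
    "geneB": [3.0, 3.5, 4.0, 3.8, 4.2],
}
ranked = rank_traits(score_table)
```

Here geneA ranks first even though geneB has the larger absolute scores, which mirrors the stated goal of prioritizing traits driven by a single exceptional association.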
6.  Higher-order chromatin domains link eQTLs with the expression of far-away genes 
Nucleic Acids Research  2013;42(1):87-96.
Distal expression quantitative trait loci (distal eQTLs) are genetic mutations that affect the expression of genes genomically far away. However, the mechanisms that cause a distal eQTL to modulate gene expression are not yet clear. Recent high-resolution chromosome conformation capture experiments along with a growing database of eQTLs provide an opportunity to understand the spatial mechanisms influencing distal eQTL associations on a genome-wide scale. We test the hypothesis that spatial proximity contributes to eQTL-gene regulation in the context of the higher-order domain structure of chromatin as determined from recent Hi-C chromosome conformation experiments. This analysis suggests that the large-scale topology of chromatin is coupled with eQTL associations by providing evidence that eQTLs are in general spatially close to their target genes, occur often around topological domain boundaries and preferentially associate with genes across domains. We also find that within-domain eQTLs that overlap with regulatory elements such as promoters and enhancers are spatially closer than the overall set of within-domain eQTLs, suggesting that spatial proximity derived from the domain structure in chromatin plays an important role in the regulation of gene expression.
doi:10.1093/nar/gkt857
PMCID: PMC3874174  PMID: 24089144
7.  Coanalysis of GWAS with eQTLs reveals disease-tissue associations 
Expression quantitative trait loci (eQTL), or genetic variants associated with changes in gene expression, have the potential to assist in interpreting results of genome-wide association studies (GWAS). eQTLs also have varying degrees of tissue specificity. By correlating the statistical significance of eQTLs mapped in various tissue types to their odds ratios reported in a large GWAS by the Wellcome Trust Case Control Consortium (WTCCC), we discovered that there is a significant association between diseases studied genetically and their relevant tissues. This suggests that eQTL data sets can be used to determine tissues that play a role in the pathogenesis of a disease, thereby highlighting these tissue types for further post-GWAS functional studies.
PMCID: PMC3392070  PMID: 22779046
8.  Genomics of ADME gene expression: mapping expression quantitative trait loci relevant for absorption, distribution, metabolism and excretion of drugs in human liver 
The Pharmacogenomics Journal  2011;13(1):12-20.
Expression quantitative trait loci (eQTL) analysis is a powerful approach toward identifying genetic loci associated with quantitative changes in gene expression. We applied genome-wide association analysis to a data set of >300 000 single-nucleotide polymorphisms and >48 000 mRNA expression phenotypes obtained by Illumina microarray profiling of 149 human surgical liver samples obtained from Caucasian donors with detailed medical documentation. Of 1226 significant associations, only 200 were validated when comparing with a previously published similar study. Potential explanations for low replication rate include differences in microarray platforms, statistical modeling, covariates considered and origin and collection procedures of tissues. Focused analysis revealed a subset of 95 associations related to absorption, distribution, metabolism and excretion of drugs. Of these, 21 were true replications and 74 were newly discovered associations in enzymes, transporters, transcriptional regulators and other genes. This study extends our knowledge about the genetics of inter-individual variability of gene expression with particular emphasis on pharmacogenomics.
doi:10.1038/tpj.2011.44
PMCID: PMC3564008  PMID: 22006096
ADME; eQTL; gene expression; pharmacogenetics; quantitative trait loci; SNP
9.  Effectively Identifying eQTLs from Multiple Tissues by Combining Mixed Model and Meta-analytic Approaches 
PLoS Genetics  2013;9(6):e1003491.
Gene expression data, in conjunction with information on genetic variants, have enabled studies to identify expression quantitative trait loci (eQTLs) or polymorphic locations in the genome that are associated with expression levels. Moreover, recent technological developments and cost decreases have further enabled studies to collect expression data in multiple tissues. One advantage of multiple tissue datasets is that studies can combine results from different tissues to identify eQTLs more accurately than examining each tissue separately. The idea of aggregating results of multiple tissues is closely related to the idea of meta-analysis which aggregates results of multiple genome-wide association studies to improve the power to detect associations. In principle, meta-analysis methods can be used to combine results from multiple tissues. However, eQTLs may have effects in only a single tissue, in all tissues, or in a subset of tissues with possibly different effect sizes. This heterogeneity in terms of effects across multiple tissues presents a key challenge to detect eQTLs. In this paper, we develop a framework that leverages two popular meta-analysis methods that address effect size heterogeneity to detect eQTLs across multiple tissues. We show by using simulations and multiple tissue data from mouse that our approach detects many eQTLs undetected by traditional eQTL methods. Additionally, our method provides an interpretation framework that accurately predicts whether an eQTL has an effect in a particular tissue.
Author Summary
The combination of gene expression and genetic variation data has enabled the identification of genetic variants that affect gene expression levels. It has been shown that some variants influence gene expression in only one tissue while others influence gene expression in multiple tissues. However, an analysis of multiple tissue data using traditional statistical methods typically fails to identify those variants that affect multiple tissues because each tissue is treated independently and due to low statistical power, the effect in a given tissue may be missed. Building on recent advances in statistical methods for meta-analysis and mixed models, we present a novel method that combines information from multiple tissues to identify genetic variation that affects multiple tissues. We show that our method detects more genetic variation that influences multiple tissues than traditional statistical methods both on simulated and real data.
doi:10.1371/journal.pgen.1003491
PMCID: PMC3681686  PMID: 23785294
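The abstract above argues that combining per-tissue results can detect eQTLs that no single tissue detects, and that effect-size heterogeneity is the main complication. A minimal sketch of the textbook building blocks follows: an inverse-variance fixed-effects combination plus Cochran's Q as a heterogeneity measure. This is not the paper's framework (which also uses mixed models); the function name `combine_tissues` and the numbers are hypothetical, and the sketch assumes independent per-tissue estimates, which rarely holds when tissues come from the same individuals.

```python
import math


def combine_tissues(betas, ses):
    """Inverse-variance fixed-effects meta-analysis across tissues.

    betas: per-tissue effect size estimates for one SNP-gene pair.
    ses:   their standard errors.
    Returns (combined beta, combined SE, z score, Cochran's Q).
    Assumes the per-tissue estimates are independent (a simplification).
    """
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Cochran's Q grows when tissues disagree about the effect size.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return beta, se, z, q


# Toy case 1 (made up): the same modest effect in three tissues.
# Each tissue alone has z = 0.5 / 0.2 = 2.5; the combined z is larger.
beta_same, se_same, z_same, q_same = combine_tissues([0.5, 0.5, 0.5],
                                                     [0.2, 0.2, 0.2])

# Toy case 2 (made up): an effect present in one tissue and absent in another.
beta_het, se_het, z_het, q_het = combine_tissues([1.0, 0.0], [0.1, 0.1])
```

A large Q, as in the second case, signals that a single shared effect size is a poor description, which is the situation the abstract identifies as the key challenge.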
10.  Mapping transcription mechanisms from multimodal genomic data 
BMC Bioinformatics  2010;11(Suppl 9):S2.
Background
Identification of expression quantitative trait loci (eQTLs) is an emerging area in genomic study. The task requires an integrated analysis of genome-wide single nucleotide polymorphism (SNP) data and gene expression data, raising a new computational challenge due to the tremendous size of data.
Results
We develop a method to identify eQTLs. The method represents eQTLs as information flux between genetic variants and transcripts. We use information theory to simultaneously interrogate SNP and gene expression data, resulting in a Transcriptional Information Map (TIM) which captures the network of transcriptional information that links genetic variations, gene expression and regulatory mechanisms. These maps are able to identify both cis- and trans- regulating eQTLs. The application on a dataset of leukemia patients identifies eQTLs in the regions of the GART, PCP4, DSCAM, and RIPK4 genes that regulate ADAMTS1, a known leukemia correlate.
Conclusions
The information theory approach presented in this paper is able to infer the dependence networks between SNPs and transcripts, which in turn can identify cis- and trans-eQTLs. The application of our method to the leukemia study explains how genetic variants and gene expression are linked to leukemia.
doi:10.1186/1471-2105-11-S9-S2
PMCID: PMC2967743  PMID: 21044360
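The information-theoretic view in the abstract above treats a SNP-transcript link as shared information between the two variables. One simple way to illustrate this is the mutual information between genotype and discretized expression. The abstract does not specify how the Transcriptional Information Map is computed, so everything here is an assumption for illustration: the median split used to discretize expression, the function names `median_split` and `mutual_information`, and the toy data.

```python
import math
import statistics
from collections import Counter


def median_split(values):
    """Discretize expression into 'high' / 'low' around the median."""
    med = statistics.median(values)
    return ["high" if v > med else "low" for v in values]


def mutual_information(genotypes, expr_bins):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(genotypes)
    count_g = Counter(genotypes)
    count_e = Counter(expr_bins)
    joint = Counter(zip(genotypes, expr_bins))
    mi = 0.0
    for (g, e), c in joint.items():
        # p(g, e) * log2( p(g, e) / (p(g) * p(e)) ), written with counts.
        mi += (c / n) * math.log2(c * n / (count_g[g] * count_e[e]))
    return mi


# Toy data (made up): genotype fully determines the expression bin.
genotypes = [0, 0, 2, 2]
bins = median_split([1.0, 1.1, 3.0, 3.2])
mi_linked = mutual_information(genotypes, bins)

# Toy data (made up): genotype carries no information about expression.
mi_unlinked = mutual_information([0, 0, 2, 2], ["low", "high", "low", "high"])
```

With only a handful of samples the estimate is heavily biased, so a real analysis would need a permutation or similar null to decide which values are meaningful.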
11.  Structured association analysis leads to insight into Saccharomyces cerevisiae gene regulation by finding multiple contributing eQTL hotspots associated with functional gene modules 
BMC Genomics  2013;14:196.
Background
Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant.
Results
While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, but, in the case of the Ribi group, we also provide experimental evidence suggesting that the identified candidates do regulate the target genes predicted by GFlasso.
Conclusions
Thus, this structured association analysis of a yeast eQTL dataset via GFlasso, coupled with extensive bioinformatics analysis, discovers a novel regulation pattern between multiple eQTL hotspots and functional gene modules. Furthermore, this analysis demonstrates the potential of GFlasso as a powerful computational tool for eQTL studies that exploit the rich structural information among expression traits due to correlation, regulation, or other forms of biological dependencies.
doi:10.1186/1471-2164-14-196
PMCID: PMC3616858  PMID: 23514438
12.  Integration of disease-specific single nucleotide polymorphisms, expression quantitative trait loci and coexpression networks reveal novel candidate genes for type 2 diabetes 
Diabetologia  2012;55(8):2205-2213.
Aims/hypothesis
While genome-wide association studies (GWASs) have been successful in identifying novel variants associated with various diseases, it has been much more difficult to determine the biological mechanisms underlying these associations. Expression quantitative trait loci (eQTL) provide another dimension to these data by associating single nucleotide polymorphisms (SNPs) with gene expression. We hypothesised that integrating SNPs known to be associated with type 2 diabetes with eQTLs and coexpression networks would enable the discovery of novel candidate genes for type 2 diabetes.
Methods
We selected 32 SNPs associated with type 2 diabetes in two or more independent GWASs. We used previously described eQTLs mapped from genotype and gene expression data collected from 1,008 morbidly obese patients to find genes with expression associated with these SNPs. We linked these genes to coexpression modules, and ranked the other genes in these modules using an inverse sum score.
Results
We found 62 genes with expression associated with type 2 diabetes SNPs. We validated our method by linking highly ranked genes in the coexpression modules back to SNPs through a combined eQTL dataset. We showed that the eQTLs highlighted by this method are significantly enriched for association with type 2 diabetes in data from the Wellcome Trust Case Control Consortium (WTCCC, p = 0.026) and the Gene Environment Association Studies (GENEVA, p = 0.042), validating our approach. Many of the highly ranked genes are also involved in the regulation or metabolism of insulin, glucose or lipids.
Conclusions/interpretation
We have devised a novel method, involving the integration of datasets of different modalities, to discover novel candidate genes for type 2 diabetes.
doi:10.1007/s00125-012-2568-3
PMCID: PMC3390705  PMID: 22584726
Genetics of type 2 diabetes; Genomics/proteomics; Mathematical modelling and simulation
13.  Heritability and Tissue Specificity of Expression Quantitative Trait Loci 
PLoS Genetics  2006;2(10):e172.
Variation in gene expression is heritable and has been mapped to the genome in humans and model organisms as expression quantitative trait loci (eQTLs). We applied integrated genome-wide expression profiling and linkage analysis to the regulation of gene expression in fat, kidney, adrenal, and heart tissues using the BXH/HXB panel of rat recombinant inbred strains. Here, we report the influence of heritability and allelic effect of the quantitative trait locus on detection of cis- and trans-acting eQTLs and discuss how these factors operate in a tissue-specific context. We identified several hundred major eQTLs in each tissue and found that cis-acting eQTLs are highly heritable and easier to detect than trans-eQTLs. The proportion of heritable expression traits was similar in all tissues; however, heritability alone was not a reliable predictor of whether an eQTL will be detected. We empirically show how the use of heritability as a filter reduces the ability to discover trans-eQTLs, particularly for eQTLs with small effects. Only 3% of cis- and trans-eQTLs exhibited large allelic effects, explaining more than 40% of the phenotypic variance, suggestive of a highly polygenic control of gene expression. Power calculations indicated that, across tissues, minor differences in genetic effects are expected to have a significant impact on detection of trans-eQTLs. Trans-eQTLs generally show smaller effects than cis-eQTLs and have a higher false discovery rate, particularly in more heterogeneous tissues, suggesting that small biological variability, likely relating to tissue composition, may influence detection of trans-eQTLs in this system. We delineate the effects of genetic architecture on variation in gene expression and show the sensitivity of this experimental design to tissue sampling variability in large-scale eQTL studies.
Synopsis
The combined application of genome-wide expression profiling from microarray experiments with genetic linkage analysis enables the mapping of expression quantitative trait loci (eQTLs), which are primary control points for gene expression across the genome. This approach has been called “genetical genomics”, and recent technological and methodological advances have made its large-scale application feasible in humans and model organisms. Using this approach, the authors have carried out an extensive analysis of the genetic architecture underlying variation in gene expression using a panel of 30 rat recombinant inbred strains. The results are used to explore the relationship between heritability of gene expression, cis- and trans-acting genetic effects, tissue heterogeneity, and statistical cut-offs of significance, which are important factors for large-scale eQTL studies. By examining large eQTL data from four tissues, the authors provide a detailed picture of cis- and trans-eQTL features that may help understanding of the genetic regulation of transcription on a genomic scale. The results also show the sensitivity of this approach to discriminate between cis and trans regulation and the value of the rat system in studying large eQTL datasets from multiple tissues.
doi:10.1371/journal.pgen.0020172
PMCID: PMC1617131  PMID: 17054398
15.  Integrative Genomics: Quantifying Significance of Phenotype-Genotype Relationships from Multiple Sources of High-Throughput Data 
Frontiers in Genetics  2013;3:202.
Given recent advances in the generation of high-throughput data such as whole-genome genetic variation and transcriptome expression, it is critical to come up with novel methods to integrate these heterogeneous datasets and to assess the significance of identified phenotype-genotype relationships. Recent studies show that genome-wide association findings are likely to fall in loci with gene regulatory effects such as expression quantitative trait loci (eQTLs), demonstrating the utility of such integrative approaches. When genotype and gene expression data are available on the same individuals, we and others developed methods wherein top phenotype-associated genetic variants are prioritized if they are associated, as eQTLs, with gene expression traits that are themselves associated with the phenotype. Yet there has been no method to determine an overall p-value for the findings that arise specifically from the integrative nature of the approach. We propose a computationally feasible permutation method that accounts for the assimilative nature of the method and the correlation structure among gene expression traits and among genotypes. We apply the method to data from a study of cellular sensitivity to etoposide, one of the most widely used chemotherapeutic drugs. To our knowledge, this study is the first statistically sound quantification of the overall significance of the genotype-phenotype relationships resulting from applying an integrative approach. This method can be easily extended to cases in which gene expression data are replaced by other molecular phenotypes of interest, e.g., microRNA or proteomic data. This study has important implications for studies seeking to expand on genetic association studies by the use of omics data. Finally, we provide R code to compute the empirical false discovery rate when p-values for the observed and simulated phenotypes are available.
doi:10.3389/fgene.2012.00202
PMCID: PMC3668276  PMID: 23755062
eQTLs; FDR; gene expression; genomics; GWAS; integrative genomics; permutation; phenotype
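The empirical false discovery rate mentioned at the end of the abstract above can be illustrated with a minimal Python sketch. This is not the authors' R code: the function name `empirical_fdr`, the input layout (one list of p-values per permuted phenotype), and the estimator (mean number of permuted p-values at or below a threshold, divided by the number of observed p-values at or below it) are assumptions chosen for illustration.

```python
def empirical_fdr(observed, permuted, threshold):
    """Estimate the FDR at a p-value threshold from permutation results.

    observed  : list of p-values from the real phenotype
    permuted  : list of lists, one list of p-values per permuted phenotype
    threshold : p-value cutoff used to call a finding significant

    Assumed estimator (illustrative, not taken from the paper):
        FDR(t) = mean_k #{permuted_k <= t} / #{observed <= t}, capped at 1.
    """
    n_observed = sum(1 for p in observed if p <= threshold)
    if n_observed == 0:
        # No discoveries at this threshold, so no false discoveries either.
        return 0.0
    # Average number of "discoveries" per permutation, i.e. under the null.
    n_null = sum(
        sum(1 for p in perm if p <= threshold) for perm in permuted
    ) / len(permuted)
    return min(1.0, n_null / n_observed)


# Toy, made-up p-values: six tests, two permutations.
observed = [0.001, 0.002, 0.03, 0.2, 0.5, 0.8]
permuted = [
    [0.01, 0.3, 0.4, 0.6, 0.7, 0.9],
    [0.04, 0.2, 0.5, 0.6, 0.8, 0.95],
]
print(empirical_fdr(observed, permuted, 0.05))
```

In practice the permutations would need to preserve the correlation structure among expression traits and genotypes, as the abstract stresses; the sketch only covers the final counting step.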
16.  Brain eQTL Mapping Informs Genetic Studies of Psychiatric Diseases 
Neuroscience bulletin  2011;27(2):123-133.
Genome-wide association studies (GWASs) have been used to identify genes that increase risk of psychiatric diseases. However, much of the variation in disease risk is still unexplained, suggesting that there are genes still to be discovered. Functional annotation of genetic variants may increase the power of GWASs to identify disease genes by providing prior information that can be used in Bayesian analysis or in reducing the number of tests. Genetic mapping of expression quantitative trait loci (eQTLs) is helping us to reveal novel functional effects of thousands of single nucleotide polymorphisms (SNPs). The published brain eQTL studies are reviewed here, and major methodological issues and their possible solutions are discussed. We emphasize the frequently-ignored problems of batch effects, covariates, and multiple testing, all of which can lead to false positives and false negatives. The future application of eQTL data to the GWAS analysis is also discussed.
PMCID: PMC3074249  PMID: 21441974
genome-wide association study; brain; psychiatric diseases; eQTL; genetics; SNP
17.  Using eQTL weights to improve power for genome-wide association studies: a genetic study of childhood asthma 
Frontiers in Genetics  2013;4:103.
Increasing evidence suggests that single nucleotide polymorphisms (SNPs) associated with complex traits are more likely to be expression quantitative trait loci (eQTLs). Incorporating eQTL information hence has potential to increase power of genome-wide association studies (GWAS). In this paper, we propose using eQTL weights as prior information in SNP based association tests to improve test power while maintaining control of the family-wise error rate (FWER) or the false discovery rate (FDR). We apply the proposed methods to the analysis of a GWAS for childhood asthma consisting of 1296 unrelated individuals with German ancestry. The results confirm that eQTLs are enriched for previously reported asthma SNPs. We also find that some SNPs are insignificant using procedures without eQTL weighting, but become significant using eQTL-weighted Bonferroni or Benjamini–Hochberg procedures, while controlling the same FWER or FDR level. Some of these SNPs have been reported by independent studies in recent literature. The results suggest that the eQTL-weighted procedures provide a promising approach for improving power of GWAS. We also report the results of our methods applied to the large-scale European GABRIEL consortium data.
doi:10.3389/fgene.2013.00103
PMCID: PMC3668139  PMID: 23755072
asthma; family-wise error rate; false discovery rate; eQTL; genome-wide association study; weighted hypothesis test
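As a rough illustration of the eQTL-weighted procedures named in the abstract above, the following Python sketch applies weighted Bonferroni and weighted Benjamini–Hochberg tests. It assumes the common formulation in which non-negative weights are scaled to average 1 and each p-value is divided by its weight; the weight values (4 for an eQTL SNP, 1 otherwise), the p-values, and all function names are hypothetical and are not taken from the paper.

```python
def normalize_weights(raw):
    """Scale non-negative weights so they average to 1."""
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def weighted_bonferroni(pvals, weights, alpha=0.05):
    """Reject H_i when p_i <= alpha * w_i / m."""
    m = len(pvals)
    return [p <= alpha * w / m for p, w in zip(pvals, weights)]


def weighted_bh(pvals, weights, alpha=0.05):
    """Benjamini-Hochberg step-up applied to the weighted p-values p_i / w_i."""
    m = len(pvals)
    # A zero weight means the hypothesis can never be rejected.
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvals, weights)]
    order = sorted(range(m), key=lambda i: q[i])
    k = 0  # largest rank whose weighted p-value passes the BH line
    for rank, i in enumerate(order, start=1):
        if q[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Made-up example: four SNPs, the first one is an eQTL.
pvals = [0.02, 0.004, 0.03, 0.5]
is_eqtl = [True, False, False, False]
weights = normalize_weights([4.0 if e else 1.0 for e in is_eqtl])

print(weighted_bonferroni(pvals, [1.0] * 4))  # unweighted baseline
print(weighted_bonferroni(pvals, weights))    # eQTL-weighted
print(weighted_bh(pvals, weights))
```

In this toy input the eQTL SNP with p = 0.02 fails the unweighted Bonferroni threshold but passes the weighted one, which is the kind of gain the abstract describes. Because the weights average 1, the weighted thresholds sum to the same overall alpha as the unweighted ones.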
18.  Identification of Genes and Networks Driving Cardiovascular and Metabolic Phenotypes in a Mouse F2 Intercross 
PLoS ONE  2010;5(12):e14319.
To identify the genes and pathways that underlie cardiovascular and metabolic phenotypes we performed an integrated analysis of a mouse C57BL/6J x A/J F2 (B6AF2) cross by relating genome-wide gene expression data from adipose, kidney, and liver tissues to physiological endpoints measured in the population. We have identified a large number of trait QTLs including loci driving variation in cardiac function on chromosomes 2 and 6 and a hotspot for adiposity, energy metabolism, and glucose traits on chromosome 8. Integration of adipose gene expression data identified a core set of genes that drive the chromosome 8 adiposity QTL. This chromosome 8 trans eQTL signature contains genes associated with mitochondrial function and oxidative phosphorylation and maps to a subnetwork with conserved function in humans that was previously implicated in human obesity. In addition, human eSNPs corresponding to orthologous genes from the signature show enrichment for association to type II diabetes in the DIAGRAM cohort, supporting the idea that the chromosome 8 locus perturbs a molecular network that in humans senses variations in DNA and in turn affects metabolic disease risk. We functionally validate predictions from this approach by demonstrating metabolic phenotypes in knockout mice for three genes from the trans eQTL signature, Akr1b8, Emr1, and Rgs2. In addition we show that the transcriptional signatures for knockout of two of these genes, Akr1b8 and Rgs2, map to the F2 network modules associated with the chromosome 8 trans eQTL signature and that these modules are in turn very significantly correlated with adiposity in the F2 population. Overall this study demonstrates how integrating gene expression data with QTL analysis in a network-based framework can aid in the elucidation of the molecular drivers of disease that can be translated from mice to humans.
doi:10.1371/journal.pone.0014319
PMCID: PMC3001864  PMID: 21179467
19.  Graph theoretical approach to study eQTL: a case study of Plasmodium falciparum 
Bioinformatics  2009;25(12):i15-i20.
Motivation: Analysis of expression quantitative trait loci (eQTL) significantly contributes to the determination of gene regulation programs. However, the discovery and analysis of associations of gene expression levels and their underlying sequence polymorphisms continue to pose many challenges. Methods are limited in their ability to illuminate the full structure of the eQTL data. Most rely on an exhaustive, genome scale search that considers all possible locus–gene pairs and tests the linkage between each locus and gene.
Result: To analyze eQTLs in a more comprehensive and efficient way, we developed the Graph based eQTL Decomposition method (GeD) that allows us to model genotype and expression data using an eQTL association graph. Through graph-based heuristics, GeD identifies dense subgraphs in the eQTL association graph. By identifying eQTL association cliques that expose the hidden structure of genotype and expression data, GeD effectively filters out most locus–gene pairs that are unlikely to have significant linkage. We apply GeD on eQTL data from Plasmodium falciparum, the human malaria parasite, and show that GeD reveals the structure of the relationship between all loci and all genes on a whole genome level. Furthermore, GeD allows us to uncover additional eQTLs with lower FDR, providing an important complement to traditional eQTL analysis methods.
Contact: przytyck@ncbi.nlm.nih.gov
doi:10.1093/bioinformatics/btp189
PMCID: PMC2687943  PMID: 19477981
20.  Impact of common regulatory single-nucleotide variants on gene expression profiles in whole blood 
Genome-wide association studies (GWASs) have uncovered susceptibility loci for a large number of complex traits. Functional interpretation of candidate genes identified by GWAS and confident assignment of the causal variant remain a major challenge. Expression quantitative trait locus (eQTL) mapping has facilitated identification of risk loci for quantitative traits and might allow prioritization of GWAS candidate genes. One major challenge of eQTL studies is the need for larger sample numbers and replication. The aim of this study was to evaluate the robustness and reproducibility of whole-blood eQTLs in humans and test their value in the identification of putative functional variants involved in the etiology of complex traits. In the current study, we performed comprehensive eQTL mapping from whole blood. The discovery sample included 322 Caucasians from a general population sample (KORA F3). We identified 363 cis and 8 trans eQTLs after stringent Bonferroni correction for multiple testing. Of these, 98.6% and 50% of cis and trans eQTLs, respectively, could be replicated in two independent populations (KORA F4 (n=740) and SHIP-TREND (n=653)). Furthermore, we identified evidence of regulatory variation for SNPs previously reported to be associated with disease loci (n=59) or quantitative trait loci (n=20), indicating a possible functional mechanism for these eSNPs. Our data demonstrate that eQTLs in whole blood are highly robust and reproducible across studies and highlight the relevance of whole-blood eQTL mapping in prioritization of GWAS candidate genes in humans.
doi:10.1038/ejhg.2012.106
PMCID: PMC3522194  PMID: 22692066
gene expression; eQTL; GWAS; whole blood
21.  SERPINA2 Is a Novel Gene with a Divergent Function from SERPINA1 
PLoS ONE  2013;8(6):e66889.
Serine protease inhibitors (SERPINs) are a superfamily of highly conserved proteins that play a key role in controlling the activity of proteases in diverse biological processes. The SERPIN cluster located at the 14q32.1 region includes the gene coding for SERPINA1, and a highly homologous sequence, SERPINA2, which was originally thought to be a pseudogene. We have previously shown that SERPINA2 is expressed in different tissues, namely leukocytes and testes, suggesting that it is a functional SERPIN. To investigate the function of SERPINA2, we used HeLa cells stably transduced with the different variants of SERPINA2 and SERPINA1 (M1, S and Z) and leukocytes as the in vivo model. We identified SERPINA2 as a 52 kDa intracellular glycoprotein, which is localized at the endoplasmic reticulum (ER), independently of the variant analyzed. SERPINA2 is not significantly regulated by the proteasome, suggesting that ER localization is not due to misfolding. Specific features of SERPINA2 include the absence of insoluble aggregates and the insignificant response to cell stress, suggesting that it is a non-polymerogenic protein whose activity diverges from that of SERPINA1. Using phylogenetic analysis, we propose an origin of SERPINA2 in the crown of primates, and we unveiled the overall conservation of SERPINA2 and A1. Nonetheless, a few SERPINA2 residues seem to have evolved faster, contributing to the emergence of a new advantageous function, possibly as a chymotrypsin-like SERPIN. Herein, we present evidence that SERPINA2 is an active gene, coding for an ER-resident protein, which may act as substrate or adjuvant of ER-chaperones.
doi:10.1371/journal.pone.0066889
PMCID: PMC3691238  PMID: 23826168
22.  Post-GWAS Functional Characterization of Susceptibility Variants for Chronic Lymphocytic Leukemia 
PLoS ONE  2012;7(1):e29632.
Recent genome-wide association studies (GWAS) have identified several gene variants associated with sporadic chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL). Many of these CLL/SLL susceptibility loci are located in non-coding or intergenic regions, posing a significant challenge to determine their potential functional relevance. Here, we review the literature of all CLL/SLL GWAS and validation studies, and apply eQTL analysis to identify putatively functional SNPs that affect gene expression that may be causal in the pathogenesis of CLL/SLL. We tested 12 independent risk loci for their potential to alter gene expression through cis-acting mechanisms, using publicly available gene expression profiles with matching genotype information. Sixteen SNPs were identified that are linked to differential expression of SP140, a putative tumor suppressor gene previously associated with CLL/SLL. Three additional SNPs were associated with differential expression of DACT3 and GNG8, which are involved in the WNT/β-catenin- and G protein-coupled receptor signaling pathways, respectively, that have been previously implicated in CLL/SLL pathogenesis. Using in silico functional prediction tools, we found that 14 of the 19 significant eQTL SNPs lie in multiple putative regulatory elements, several of which have prior implications in CLL/SLL or other hematological malignancies. Although experimental validation is needed, our study shows that the use of existing GWAS data in combination with eQTL analysis and in silico methods represents a useful starting point to screen for putatively causal SNPs that may be involved in the etiology of CLL/SLL.
doi:10.1371/journal.pone.0029632
PMCID: PMC3250464  PMID: 22235315
23.  Genome-Wide Co-Expression Analysis in Multiple Tissues 
PLoS ONE  2008;3(12):e4033.
Expression quantitative trait loci (eQTLs) represent genetic control points of gene expression, and can be categorized as cis- and trans-acting, reflecting local and distant regulation of gene expression respectively. Although there is evidence of co-regulation within clusters of trans-eQTLs, the extent of co-expression patterns and their relationship with the genotypes at eQTLs are not fully understood. We have mapped thousands of cis- and trans-eQTLs in four tissues (fat, kidney, adrenal and left ventricle) in a large panel of rat recombinant inbred (RI) strains. Here we investigate the genome-wide correlation structure in expression levels of eQTL transcripts and underlying genotypes to elucidate the nature of co-regulation within cis- and trans-eQTL datasets. Across the four tissues, we consistently found statistically significant correlations of cis-regulated gene expression to be rare (<0.9% of all pairs tested). Most (>80%) of the observed significant correlations of cis-regulated gene expression are explained by correlation of the underlying genotypes. In comparison, co-expression of trans-regulated gene expression is more common, with significant correlation ranging from 2.9%–14.9% of all pairs of trans-eQTL transcripts. We observed a total of 81 trans-eQTL clusters (hot-spots), defined as consisting of ≥10 eQTLs linked to a common region, with very high levels of correlation between trans-regulated transcripts (77.2–90.2%). Moreover, functional analysis of large trans-eQTL clusters (≥30 eQTLs) revealed significant functional enrichment among genes comprising 80% of the large clusters. The results of this genome-wide co-expression study show the effects of the eQTL genotypes on the observed patterns of correlation, and suggest that functional relatedness between genes underlying trans-eQTLs is reflected in the degree of co-expression observed in trans-eQTL clusters. 
Our results demonstrate the power of an integrative, systematic approach to the analysis of a large gene expression dataset to uncover underlying structure, and inform future eQTL studies.
doi:10.1371/journal.pone.0004033
PMCID: PMC2603584  PMID: 19112506
24.  An integrative functional genomics approach for discovering biomarkers in schizophrenia 
Briefings in Functional Genomics  2011;10(6):387-399.
Schizophrenia (SZ) is a complex disorder resulting from both genetic and environmental causes with a worldwide lifetime prevalence of 1%; however, there are no specific, sensitive and validated biomarkers for SZ. A general unifying hypothesis has been put forward that disease-associated single nucleotide polymorphisms (SNPs) from genome-wide association studies (GWAS) are more likely to be associated with gene expression quantitative trait loci (eQTL). We will describe this hypothesis and review primary methodology with refinements for testing this paradigmatic approach in SZ. We will describe biomarker studies of SZ and testing enrichment of SNPs that are associated both with eQTLs and existing GWAS of SZ. SZ-associated SNPs that overlap with eQTLs can be placed into gene–gene expression, protein–protein and protein–DNA interaction networks. Further, those networks can be tested by reducing/silencing the gene expression levels of critical nodes. We present pilot data to support these methods of investigation such as the use of eQTLs to annotate GWASs of SZ, which could be applied to the field of biomarker discovery. Those networks that have association with SNP markers, especially cis-regulated expression, might lead to a clearer understanding of important candidate genes that predispose to disease and alter expression. This method has general application to many complex disorders.
doi:10.1093/bfgp/elr036
PMCID: PMC3277082  PMID: 22155586
expression quantitative trait loci; cis-regulatory SNPs; GWAS; gene expression; lymphoblastoid cell lines
25.  Candidate Causal Regulatory Effects by Integration of Expression QTLs with Complex Trait Genetic Associations 
PLoS Genetics  2010;6(4):e1000895.
The recent success of genome-wide association studies (GWAS) is now followed by the challenge to determine how the reported susceptibility variants mediate complex traits and diseases. Expression quantitative trait loci (eQTLs) have been implicated in disease associations through overlaps between eQTLs and GWAS signals. However, the abundance of eQTLs and the strong correlation structure (LD) in the genome make it likely that some of these overlaps are coincidental and not driven by the same functional variants. In the present study, we propose an empirical methodology, which we call Regulatory Trait Concordance (RTC), that accounts for local LD structure and integrates eQTLs and GWAS results in order to reveal the subset of association signals that are due to cis eQTLs. We simulate genomic regions of various LD patterns with either one or two causal variants and show that our score outperforms SNP correlation metrics, be they statistical (r²) or historical (D'). Following the observation of a significant abundance of regulatory signals among currently published GWAS loci, we apply our method with the goal of prioritizing relevant genes for each of the respective complex traits. We detect several potential disease-causing regulatory effects, with a strong enrichment for immunity-related conditions, consistent with the nature of the cell line tested (LCLs). Furthermore, we present an extension of the method in trans, where interrogating the whole genome for downstream effects of the disease variant can be informative regarding its unknown primary biological effect. We conclude that integrating cellular phenotype associations with organismal complex traits will facilitate the biological interpretation of the genetic effects on these traits.
Author Summary
Genome-wide association studies have led to the identification of susceptibility loci for a variety of human complex traits. What is still largely missing, however, is the understanding of the biological context in which these candidate variants act and of how they determine each trait. Given the localization of many GWAS loci outside coding regions and the important role of regulatory variation in shaping phenotypic variance, gene expression has been proposed as a plausible informative intermediate phenotype. Here we show that for a subset of the currently published GWAS this is indeed the case, by observing a significant excess of regulatory variants among disease loci. We propose an empirical methodology (regulatory trait concordance, RTC) able to integrate expression and disease data in order to detect causal regulatory effects. We show that the RTC outperforms simple correlation metrics under various simulated linkage disequilibrium (LD) scenarios. Our method is able to recover previously suspected causal regulatory effects from the literature and, as expected given the nature of the tested tissue, an overrepresentation of immunity-related candidates is observed. As the number of available tissues increases, this prioritization approach will become even more useful in understanding the role of regulatory variants in disease etiology.
doi:10.1371/journal.pgen.1000895
PMCID: PMC2848550  PMID: 20369022
