Related Articles

1.  WormQTLHD—a web database for linking human disease to natural variation data in C. elegans 
Nucleic Acids Research  2013;42(Database issue):D794-D801.
Interactions between proteins are highly conserved across species. As a result, the molecular basis of multiple diseases affecting humans can be studied in model organisms that offer many alternative experimental opportunities. One such organism, Caenorhabditis elegans, has been used to produce much molecular quantitative genetics and systems biology data over the past decade. We present WormQTLHD (Human Disease), a database that quantitatively and systematically links expression Quantitative Trait Loci (eQTL) findings in C. elegans to gene–disease associations in man. WormQTLHD, available online, is a user-friendly set of tools to reveal functionally coherent, evolutionarily conserved gene networks. These can be used to predict novel gene-to-gene associations and the functions of genes underlying the disease of interest. We created a new database that links C. elegans eQTL data sets to human diseases (34 337 gene–disease associations from OMIM, DGA, GWAS Central and NHGRI GWAS Catalogue) based on overlapping sets of orthologous genes associated to phenotypes in these two species. We utilized QTL results, high-throughput molecular phenotypes, classical phenotypes and genotype data covering different developmental stages and environments from WormQTL database. All software is available as open source, built on MOLGENIS and xQTL workbench.
PMCID: PMC3965109  PMID: 24217915
2.  WormQTL—public archive and analysis web portal for natural variation data in Caenorhabditis spp 
Nucleic Acids Research  2012;41(Database issue):D738-D743.
Here, we present WormQTL, an easily accessible database enabling search, comparative analysis and meta-analysis of all data on variation in Caenorhabditis spp. Over the past decade, Caenorhabditis elegans has become instrumental for molecular quantitative genetics and the systems biology of natural variation. These efforts have resulted in a valuable amount of phenotypic, high-throughput molecular and genotypic data across different developmental worm stages and environments in hundreds of C. elegans strains. WormQTL provides a workbench of analysis tools for genotype–phenotype linkage and association mapping based on but not limited to R/qtl. All data can be uploaded and downloaded using simple delimited text or Excel formats and are accessible via a public web user interface for biologists and R statistic and web service interfaces for bioinformaticians, based on open source MOLGENIS and xQTL workbench software. WormQTL welcomes data submissions from other worm researchers.
PMCID: PMC3531126  PMID: 23180786
3.  R/qtl: high-throughput multiple QTL mapping 
Bioinformatics  2010;26(23):2990-2992.
Motivation: R/qtl is free and powerful software for mapping and exploring quantitative trait loci (QTL). R/qtl provides a fully comprehensive range of methods for a wide range of experimental cross types. We recently added multiple QTL mapping (MQM) to R/qtl. MQM adds higher statistical power to detect and disentangle the effects of multiple linked and unlinked QTL compared with many other methods. MQM for R/qtl adds many new features including improved handling of missing data, analysis of 10 000 s of molecular traits, permutation for determining significance thresholds for QTL and QTL hot spots, and visualizations for cis–trans and QTL interaction effects. MQM for R/qtl is the first free and open source implementation of MQM that is multi-platform, scalable and suitable for automated procedures and large genetical genomics datasets.
Availability: R/qtl is free and open source multi-platform software for the statistical language R, and is made available under the GPLv3 license. Queries about R/qtl should be directed to the R/qtl mailing list.
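The abstract above mentions permutation for determining significance thresholds. The following is a minimal Python sketch of that general idea, not R/qtl's or MQM's implementation: the function names, the 0/1 genotype coding and the mean-difference scan statistic are all assumptions chosen for illustration.

```python
import random
from statistics import mean

def scan_statistic(genotypes, phenotype):
    """Largest absolute difference in mean phenotype between the two
    genotype classes (coded 0/1), taken over all markers.
    This simple statistic is an illustrative stand-in for a LOD score."""
    best = 0.0
    for marker in genotypes:  # one list of 0/1 calls per marker
        group0 = [y for g, y in zip(marker, phenotype) if g == 0]
        group1 = [y for g, y in zip(marker, phenotype) if g == 1]
        if group0 and group1:
            best = max(best, abs(mean(group1) - mean(group0)))
    return best

def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide threshold: shuffle phenotypes to break any genotype
    association, record the maximum statistic of each shuffled scan, and
    return approximately the (1 - alpha) quantile of those maxima."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_max.append(scan_statistic(genotypes, shuffled))
    null_max.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return null_max[index]

# Hypothetical toy data: 2 markers, 8 individuals.
genotypes = [[0, 0, 0, 0, 1, 1, 1, 1],
             [0, 1, 0, 1, 0, 1, 0, 1]]
phenotype = [1.0, 1.2, 0.9, 1.1, 2.0, 2.1, 1.9, 2.2]

observed = scan_statistic(genotypes, phenotype)
threshold = permutation_threshold(genotypes, phenotype, n_perm=200)
print(observed, threshold)
```

Taking the maximum over all markers in each permutation is what makes the threshold genome-wide rather than per-marker.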
PMCID: PMC2982156  PMID: 20966004
4.  iBMQ: a R/Bioconductor package for integrated Bayesian modeling of eQTL data 
Bioinformatics  2013;29(21):2797-2798.
Motivation: Recently, mapping studies of expression quantitative loci (eQTL) (where gene expression levels are viewed as quantitative traits) have provided insight into the biology of gene regulation. Bayesian methods provide natural modeling frameworks for analyzing eQTL studies, where information shared across markers and/or genes can increase the power to detect eQTLs. Bayesian approaches tend to be computationally demanding and require specialized software. As a result, most eQTL studies use univariate methods treating each gene independently, leading to suboptimal results.
Results: We present a powerful, computationally optimized and free open-source R package, iBMQ. Our package implements a joint hierarchical Bayesian model where all genes and SNPs are modeled concurrently. Model parameters are estimated using a Markov chain Monte Carlo algorithm. The free and widely used openMP parallel library speeds up computation. Using a mouse cardiac dataset, we show that iBMQ improves the detection of large trans-eQTL hotspots compared with other state-of-the-art packages for eQTL analysis.
Availability: The R-package iBMQ is available from the Bioconductor Web site and runs on Linux, Windows and Mac OS X. It is distributed under the Artistic Licence-2.0 terms.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3799478  PMID: 23958729
5.  Genome-Wide Linkage Analysis of Global Gene Expression in Loin Muscle Tissue Identifies Candidate Genes in Pigs 
PLoS ONE  2011;6(2):e16766.
Nearly 6,000 QTL have been reported for 588 different traits in pigs, more than in any other livestock species. However, this effort has translated into only a few confirmed causative variants. A powerful strategy for revealing candidate genes involves expression QTL (eQTL) mapping, where the mRNA abundance of a set of transcripts is used as the response variable for a QTL scan.
Methodology/Principal Findings
We utilized a whole genome expression microarray and an F2 pig resource population to conduct a global eQTL analysis in loin muscle tissue, and compared results to previously inferred phenotypic QTL (pQTL) from the same experimental cross. We found 62 unique eQTL (FDR <10%) and identified 3 gene networks enriched with genes subject to genetic control involved in lipid metabolism, DNA replication, and cell cycle regulation. We observed strong evidence of local regulation (40 out of 59 eQTL with known genomic position) and compared these eQTL to pQTL to help identify potential candidate genes. Among the interesting associations, we found aldo-keto reductase 7A2 (AKR7A2) and thioredoxin domain containing 12 (TXNDC12) eQTL that are part of a network associated with lipid metabolism and in turn overlap with pQTL regions for marbling, % intramuscular fat (% fat) and loin muscle area on Sus scrofa (SSC) chromosome 6. Additionally, we report 13 genomic regions with overlapping eQTL and pQTL involving 14 local eQTL.
Results of this analysis provide novel candidate genes for important complex pig phenotypes.
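The abstract reports eQTL at FDR <10% without naming the procedure used. As a hedged illustration only, the sketch below applies the Benjamini-Hochberg step-up procedure, one common way to control the false discovery rate; the function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return the indices of tests declared significant at the given FDR
    using the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its own threshold.
        if p_values[i] <= rank * fdr / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Hypothetical p-values from five transcript scans.
example_p = [0.001, 0.04, 0.03, 0.5, 0.02]
print(benjamini_hochberg(example_p, fdr=0.10))
```

Every test ranked at or below the cutoff is kept, even if its own p-value exceeded its rank-specific threshold, which is what distinguishes the step-up procedure from a per-test comparison.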
PMCID: PMC3035619  PMID: 21346809
6.  Synthesis of 53 tissue and cell line expression QTL datasets reveals master eQTLs 
BMC Genomics  2014;15(1):532.
Gene expression genetic studies in human tissues and cells identify cis- and trans-acting expression quantitative trait loci (eQTLs). These eQTLs provide insights into regulatory mechanisms underlying disease risk. However, few studies systematically characterized eQTL results across cell and tissue types. We synthesized eQTL results from >50 datasets, including new primary data from human brain, peripheral plaque and kidney samples, in order to discover features of human eQTLs.
We find a substantial number of robust cis-eQTLs and far fewer trans-eQTLs consistent across tissues. Analysis of 45 full human GWAS scans indicates eQTLs are enriched overall, and above nSNPs, among positive statistical signals in genetic mapping studies, and account for a significant fraction of the strongest human trait effects. Expression QTLs are enriched for gene centricity, higher population allele frequencies, in housekeeping genes, and for coincidence with regulatory features, though there is little evidence of 5′ or 3′ positional bias. Several regulatory categories are not enriched including microRNAs and their predicted binding sites and long, intergenic non-coding RNAs. Among the most tissue-ubiquitous cis-eQTLs, there is enrichment for genes involved in xenobiotic metabolism and mitochondrial function, suggesting these eQTLs may have adaptive origins. Several strong eQTLs (CDK5RAP2, NBPFs) coincide with regions of reported human lineage selection. The intersection of new kidney and plaque eQTLs with related GWAS suggests possible gene prioritization. For example, butyrophilins are now linked to arterial pathogenesis via multiple genetic and expression studies. Expression QTL and GWAS results are made available as a community resource through the NHLBI GRASP database.
Expression QTLs inform the interpretation of human trait variability, and may account for a greater fraction of phenotypic variability than protein-coding variants. The synthesis of available tissue eQTL data highlights many strong cis-eQTLs that may have important biologic roles and could serve as positive controls in future studies. Our results indicate some strong tissue-ubiquitous eQTLs may have adaptive origins in humans. Efforts to expand the genetic, splicing and tissue coverage of known eQTLs will provide further insights into human gene regulation.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-532) contains supplementary material, which is available to authorized users.
PMCID: PMC4102726  PMID: 24973796
eQTL; RNA; Gene expression; Genomics; Transcriptome; GWAS; Genome-wide; Tissue; Cis; Trans
7.  solQTL: a tool for QTL analysis, visualization and linking to genomes at SGN database 
BMC Bioinformatics  2010;11:525.
A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases.
The Sol Genomics Network (SGN) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterids species and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-links with other relevant data in internal and external databases. The new QTL module, solQTL, employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application.
solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. Exploration and synthesis of the relevant data is expected to help facilitate identification of candidate genes underlying phenotypic variation and markers more closely linked to QTLs. solQTL is freely available on SGN and can be used in private or public mode.
PMCID: PMC2984588  PMID: 20964836
8.  Expression quantitative trait loci infer the regulation of isoflavone accumulation in soybean (Glycine max L. Merr.) seed 
BMC Genomics  2014;15(1):680.
Mapping expression quantitative trait loci (eQTL) of targeted genes represents a powerful and widely adopted approach to identify putative regulatory variants. Linking regulation differences to specific genes might assist in the identification of networks and interactions. The objective of this study is to identify eQTL underlying expression of four gene families encoding isoflavone synthetic enzymes involved in the phenylpropanoid pathway, which are phenylalanine ammonia-lyase (PAL), chalcone synthase (CHS), 2-hydroxyisoflavanone synthase (IFS; EC 1.14.13.136) and flavanone 3-hydroxylase (F3H). A population of 130 recombinant inbred lines (F5:11), derived from a cross between soybean cultivar ‘Zhongdou 27’ (high isoflavone) and ‘Jiunong 20’ (low isoflavone), and a total of 194 simple sequence repeat (SSR) markers were used in this study. Overlapped loci of eQTLs and phenotypic QTLs (pQTLs) were analyzed to identify the potential candidate genes underlying the accumulation of isoflavone in soybean seed.
Thirty three eQTLs (thirteen cis-eQTLs and twenty trans-eQTLs) underlying the transcript abundance of the four gene families were identified on fifteen chromosomes. The eQTLs between Satt278-Sat_134, Sat_134-Sct_010 and Satt149-Sat_234 underlie the expression of both IFS and CHS genes. Five eQTL intervals were overlapped with pQTLs. A total of eleven candidate genes within the overlapped eQTL and pQTL were identified.
These results will be useful for the development of marker-assisted selection to breed soybean cultivars with high or low isoflavone contents and for map-based cloning of new isoflavone related genes.
PMCID: PMC4138391  PMID: 25124843
Soybean; eQTL; Isoflavone; pQTL; Candidate genes
9.  Mapping epistatic quantitative trait loci 
BMC Genetics  2014;15(1):112.
How to map quantitative trait loci (QTL) with epistasis efficiently and reliably has been a persistent problem for QTL mapping analysis. There are a number of difficulties for studying epistatic QTL. Linkage can impose a significant challenge for finding epistatic QTL reliably. If multiple QTL are in linkage and have interactions, searching for QTL can become a very delicate issue. A commonly used strategy that performs a two-dimensional genome scan to search for a pair of QTL with epistasis can suffer from low statistical power and also may lead to false identification due to complex linkage disequilibrium and interaction patterns.
To tackle the problem of complex interaction of multiple QTL with linkage, we developed a three-stage search strategy. In the first stage, main effect QTL are searched and mapped. In the second stage, epistatic QTL that interact significantly with other identified QTL are searched. In the third stage, new epistatic QTL are searched in pairs. This strategy is based on the consideration that most genetic variance is due to the main effects of QTL. Thus by first mapping those main-effect QTL, the statistical power for the second and third stages of analysis for mapping epistatic QTL can be maximized. The search for main effect QTL is robust and does not bias the search for epistatic QTL due to a genetic property associated with the orthogonal genetic model that the additive and additive by additive variances are independent despite linkage. The model search criterion is empirically and dynamically evaluated by using a score-statistic based resampling procedure. We demonstrate through simulations that the method has good power and a low false-positive rate in the identification of QTL and epistasis.
This method provides an effective and powerful solution to map multiple QTL with complex epistatic pattern. The method has been implemented in the user-friendly computer software Windows QTL Cartographer. This will greatly facilitate the application of the method for QTL mapping data analysis.
Electronic supplementary material
The online version of this article (doi:10.1186/s12863-014-0112-9) contains supplementary material, which is available to authorized users.
PMCID: PMC4226885  PMID: 25367219
Quantitative trait loci; Epistasis; Model selection; Sequential search
10.  Bioinformatics tools and database resources for systems genetics analysis in mice—a short review and an evaluation of future needs 
Briefings in Bioinformatics  2011;13(2):135-142.
During a meeting of the SYSGENET working group ‘Bioinformatics’, currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future compatibility of software and exchange of already developed software modules, a strong recommendation was made by the group to integrate HAPPY and R/qtl analysis toolboxes, GeneNetwork and XGAP database platforms, and TIQS and xQTL processing platforms. R should be used as the principal computer language for QTL data analysis in all platforms and a ‘cloud’ should be used for software dissemination to the community. Furthermore, the working group recommended that all data models and software source code should be made visible in public repositories to allow a coordinated effort on the use of common data structures and file formats.
PMCID: PMC3294237  PMID: 22396485
QTL mapping; database; mouse; systems genetics
11.  A Bayesian Framework to Account for Complex Non-Genetic Factors in Gene Expression Levels Greatly Increases Power in eQTL Studies 
PLoS Computational Biology  2010;6(5):e1000770.
Gene expression measurements are influenced by a wide range of factors, such as the state of the cell, experimental conditions and variants in the sequence of regulatory regions. To understand the effect of a variable of interest, such as the genotype of a locus, it is important to account for variation that is due to confounding causes. Here, we present VBQTL, a probabilistic approach for mapping expression quantitative trait loci (eQTLs) that jointly models contributions from genotype as well as known and hidden confounding factors. VBQTL is implemented within an efficient and flexible inference framework, making it fast and tractable on large-scale problems. We compare the performance of VBQTL with alternative methods for dealing with confounding variability on eQTL mapping datasets from simulations, yeast, mouse, and human. Employing Bayesian complexity control and joint modelling is shown to result in more precise estimates of the contribution of different confounding factors resulting in additional associations to measured transcript levels compared to alternative approaches. We present a threefold larger collection of cis eQTLs than previously found in a whole-genome eQTL scan of an outbred human population. Altogether, 27% of the tested probes show a significant genetic association in cis, and we validate that the additional eQTLs are likely to be real by replicating them in different sets of individuals. Our method is the next step in the analysis of high-dimensional phenotype data, and its application has revealed insights into genetic regulation of gene expression by demonstrating more abundant cis-acting eQTLs in human than previously shown. Our software is freely available online.
Author Summary
Gene expression is a complex phenotype. The measured expression level in an experiment can be affected by a wide range of factors—state of the cell, experimental conditions, variants in the sequence of regulatory regions, and others. To understand genotype-to-phenotype relationships, we need to be able to distinguish the variation that is due to the genetic state from all the confounding causes. We present VBQTL, a probabilistic method for dissecting gene expression variation by jointly modelling the underlying global causes of variability and the genetic effect. Our method is implemented in a flexible framework that allows for quick model adaptation and comparison with alternative models. The probabilistic approach yields more accurate estimates of the contributions from different sources of variation. Applying VBQTL, we find that common genetic variation controlling gene expression levels in human is more abundant than previously shown, which has implications for a wide range of studies relating genotype to phenotype.
PMCID: PMC2865505  PMID: 20463871
12.  Clusthaplo: a plug-in for MCQTL to enhance QTL detection using ancestral alleles in multi-cross design 
Key message
We enhance power and accuracy of QTL mapping in multiple related families, by clustering the founders of the families on their local genomic similarity.
MCQTL is a linkage mapping software application that allows the joint QTL mapping of multiple related families. In its current implementation, QTLs are modeled with one or two parameters for each parent that is a founder of the multi-cross design. The higher the number of parents, the higher the number of model parameters, which can impact the power and the accuracy of the mapping. We propose to make use of the availability of increasingly dense genotyping information on the founders to lessen the number of MCQTL parameters and thus boost the QTL discovery. We developed clusthaplo, an R package, which aims to cluster haplotypes using a genomic similarity that reflects the probability of sharing the same ancestral allele. Computed in a sliding window along the genome and followed by a clustering method, the genomic similarity allows the local clustering of the parent haplotypes. Our assumption is that the haplotypes belonging to the same class transmit the same ancestral allele, so their putative QTL allelic effects can be modeled with the same parameter, leading to a parsimonious model that is plugged into MCQTL. Intensive simulations using three maize data sets showed the significant gain in power and in accuracy of the QTL mapping with the ancestral allele model compared to the classical MCQTL model. MCQTL_LD (clusthaplo outputs plugged into MCQTL) is a versatile and powerful tool for QTL mapping in multiple related families that makes use of linkage and linkage disequilibrium.
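The sliding-window clustering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not clusthaplo's algorithm: the similarity here is plain allele identity within a window (the package uses a measure reflecting the probability of shared ancestry), the grouping is single-linkage with a fixed threshold, and all names and data are hypothetical.

```python
def window_similarity(hap_a, hap_b, start, width):
    """Fraction of identical alleles between two haplotypes in one window.
    Simple identity is an assumed stand-in for an ancestry-based similarity."""
    a = hap_a[start:start + width]
    b = hap_b[start:start + width]
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cluster_window(haplotypes, start, width, threshold=0.9):
    """Group parents whose similarity in the window reaches the threshold.
    Grouping is single-linkage, implemented with a small union-find."""
    names = list(haplotypes)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the root, compressing the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if window_similarity(haplotypes[a], haplotypes[b], start, width) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())

def sliding_clusters(haplotypes, width, step=1, threshold=0.9):
    """Map each window start position to the parent clusters found there."""
    length = len(next(iter(haplotypes.values())))
    return {start: cluster_window(haplotypes, start, width, threshold)
            for start in range(0, length - width + 1, step)}

# Hypothetical founder haplotypes over 8 markers.
founders = {"P1": "AAGGTTCC", "P2": "AAGGTACC", "P3": "CCTTAACC"}
local_clusters = sliding_clusters(founders, width=4, step=4, threshold=0.75)
print(local_clusters)
```

Parents that fall in the same cluster at a position would share one allelic-effect parameter there, which is how the clustering reduces the number of model parameters.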
Electronic supplementary material
The online version of this article (doi:10.1007/s00122-014-2267-1) contains supplementary material, which is available to authorized users.
PMCID: PMC3964294  PMID: 24482114
13.  The MOLGENIS toolkit: rapid prototyping of biosoftware at the push of a button 
BMC Bioinformatics  2010;11(Suppl 12):S12.
There is a huge demand on bioinformaticians to provide their biologists with user friendly and scalable software infrastructures to capture, exchange, and exploit the unprecedented amounts of new *omics data. We here present MOLGENIS, a generic, open source, software toolkit to quickly produce the bespoke MOLecular GENetics Information Systems needed.
The MOLGENIS toolkit provides bioinformaticians with a simple language to model biological data structures and user interfaces. At the push of a button, MOLGENIS’ generator suite automatically translates these models into a feature-rich, ready-to-use web application including database, user interfaces, exchange formats, and scriptable interfaces. Each generator is a template of SQL, JAVA, R, or HTML code that would require much effort to write by hand. This ‘model-driven’ method ensures reuse of best practices and improves quality because the modeling language and generators are shared between all MOLGENIS applications, so that errors are found quickly and improvements are shared easily by a re-generation. A plug-in mechanism ensures that both the generator suite and generated product can be customized just as much as hand-written software.
In recent years we have successfully evaluated the MOLGENIS toolkit for the rapid prototyping of many types of biomedical applications, including next-generation sequencing, GWAS, QTL, proteomics and biobanking. Writing 500 lines of model XML typically replaces 15,000 lines of hand-written programming code, which allows for quick adaptation if the information system is not yet to the biologist’s satisfaction. Each application generated with MOLGENIS comes with an optimized database back-end, user interfaces for biologists to manage and exploit their data, programming interfaces for bioinformaticians to script analysis tools in R, Java, SOAP, REST/JSON and RDF, a tab-delimited file format to ease upload and exchange of data, and detailed technical documentation. Existing databases can be quickly enhanced with MOLGENIS generated interfaces using the ‘ExtractModel’ procedure.
The MOLGENIS toolkit provides bioinformaticians with a simple model to quickly generate flexible web platforms for all possible genomic, molecular and phenotypic experiments with a richness of interfaces not provided by other tools. All the software and manuals are available free as LGPLv3 open source.
PMCID: PMC3040526  PMID: 21210979
14.  Allele-specific expression and eQTL analysis in mouse adipose tissue 
BMC Genomics  2014;15(1):471.
The simplest definition of cis-eQTLs, as opposed to trans-eQTLs, refers to genetic variants that affect expression in an allele-specific manner, with implications for the underlying mechanism. Yet, due to technical limitations of expression microarrays, the vast majority of eQTL studies performed in the last decade used a genomic-distance-based definition as a surrogate for cis, therefore exploring local rather than cis-eQTLs.
In this study we use RNAseq to explore allele specific expression (ASE) in adipose tissue of male and female F1 mice, produced from reciprocal crosses of C57BL/6J and DBA/2J strains. Comparison of the identified cis-eQTLs, to local-eQTLs, that were obtained from adipose tissue expression in two previous population based studies in our laboratory, yields poor overlap between the two mapping approaches, while both local-eQTL studies show highly concordant results. Specifically, local-eQTL studies show ~60% overlap between themselves, while only 15-20% of local-eQTLs are identified as cis by ASE, and less than 50% of ASE genes are recovered in local-eQTL studies. Utilizing recently published ENCODE data, we also find that ASE genes show significant bias for SNPs prevalence in DNase I hypersensitive sites that is ASE direction specific.
We suggest a new approach to analysis of allele specific expression that is more sensitive and accurate than the commonly used Fisher or chi-square statistics. Our analysis indicates that technical differences between the cis and local-eQTL approaches, such as differences in genomic background or sex specificity, account for a relatively small fraction of the discrepancy. Therefore, we suggest that the differences between the two eQTL mapping approaches may facilitate sorting of SNP-eQTL interactions into true cis and trans, and that a considerable portion of local-eQTL may actually represent trans interactions.
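For context, the commonly used chi-square statistic mentioned above can be sketched as below. This is the baseline the authors compare against, not their new approach, which the abstract does not describe; the function name and read counts are hypothetical.

```python
import math

def ase_chi_square(ref_reads, alt_reads):
    """Chi-square goodness-of-fit test (1 degree of freedom) of the null
    hypothesis that both alleles are expressed equally (50:50 read split).
    Returns the test statistic and its p-value."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no allele-informative reads")
    expected = total / 2
    stat = ((ref_reads - expected) ** 2 + (alt_reads - expected) ** 2) / expected
    # For 1 degree of freedom, the upper tail probability of the
    # chi-square distribution equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical read counts at one heterozygous SNP in an F1 animal.
stat, p_value = ase_chi_square(70, 30)
print(stat, p_value)
```

A test of this kind treats every read as independent, which is one reason simple count-based statistics can be miscalibrated for RNA-seq data.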
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-471) contains supplementary material, which is available to authorized users.
PMCID: PMC4089026  PMID: 24927774
Cis; Trans; eQTL; Allele Specific Expression; Adipose; RNA-seq; DNase I hypersensitivity; DBA/2J; C57BL/6J
15.  Meta-analyses of QTL for grain yield and anthesis silking interval in 18 maize populations evaluated under water-stressed and well-watered environments 
BMC Genomics  2013;14:313.
Identification of QTL with large phenotypic effects conserved across genetic backgrounds and environments is one of the prerequisites for crop improvement using marker assisted selection (MAS). The objectives of this study were to identify meta-QTL (mQTL) for grain yield (GY) and anthesis silking interval (ASI) across 18 bi-parental maize populations evaluated in the same conditions across 2-4 managed water stressed and 3-4 well watered environments.
The meta-analyses identified 68 mQTL (9 QTL specific to ASI, 15 specific to GY, and 44 for both GY and ASI). Mean phenotypic variance explained by each mQTL varied from 1.2 to 13.1% and the overall average was 6.5%. Few QTL were detected under both environmental treatments and/or multiple (>4 populations) genetic backgrounds. The number and 95% genetic and physical confidence intervals of the mQTL were highly reduced compared to the QTL identified in the original studies. Each physical interval of the mQTL consisted of 5 to 926 candidate genes.
Meta-analyses reduced the number of QTL by 68% and narrowed the confidence intervals up to 12-fold. At least the 4 mQTL (mQTL2.2, mQTL6.1, mQTL7.5 and mQTL9.2) associated with GY under both water-stressed and well-watered environments and detected in up to 6 populations may be considered for fine mapping and validation to confirm effects in different genetic backgrounds and pyramid them into new drought resistant breeding lines. This is the first extensive report on meta-analysis of data from over 3100 individuals genotyped using the same SNP platform and evaluated in the same conditions across a wide range of managed water-stressed and well-watered environments.
PMCID: PMC3751468  PMID: 23663209
Breeding; Drought; Heritability; Maize; Managed water stress; Meta analysis; SNP
16.  Body composition and gene expression QTL mapping in mice reveals imprinting and interaction effects 
BMC Genetics  2013;14:103.
Shifts in body composition, such as accumulation of body fat, can be a symptom of many chronic human diseases; hence, efforts have been made to investigate the genetic mechanisms that underlie body composition. For example, a few quantitative trait loci (QTL) have been discovered using genome-wide association studies, which will eventually lead to the discovery of causal mutations that are associated with tissue traits. Although some body composition QTL have been identified in mice, limited research has been focused on the imprinting and interaction effects that are involved in these traits. Previously, we found that Myostatin genotype, reciprocal cross, and sex interacted with numerous chromosomal regions to affect growth traits.
Here, we report on the identification of muscle, adipose, and morphometric phenotypic QTL (pQTL), translation and transcription QTL (tQTL) and expression QTL (eQTL) by applying a QTL model with additive, dominance, imprinting, and interaction effects. Using an F2 population of 1000 mice derived from the Myostatin-null C57BL/6 and M16i mouse lines, six imprinted pQTL were discovered on chromosomes 6, 9, 10, 11, and 18. We also identified two IGF1 and two Atp2a2 eQTL, which could be important trans-regulatory elements. pQTL, tQTL and eQTL that interacted with Myostatin, reciprocal cross, and sex were detected as well. Combining with the additive and dominance effect, these variants accounted for a large amount of phenotypic variation in this study.
Our study indicates that both imprinting and interaction effects are important components of the genetic model of body composition traits. Furthermore, the integration of eQTL and traditional QTL mapping may help to explain more phenotypic variation than either alone, thereby uncovering more molecular details of how tissue traits are regulated.
PMCID: PMC4233306  PMID: 24165562
eQTL mapping; QTL mapping; Body composition; Myostatin; Imprinting; Interaction; Mouse
17.  Meta-eQTL: a tool set for flexible eQTL meta-analysis 
BMC Bioinformatics  2014;15(1):392.
An increasing number of eQTL (expression quantitative trait loci) datasets facilitates genetics and systems biology research. Meta-analysis tools are needed to jointly analyze datasets of the same or similar tissue types to improve statistical power, especially in trans-eQTL mapping. A meta-analysis framework is also necessary for ChrX eQTL discovery.
We developed a novel tool, meta-eqtl, for fast eQTL meta-analysis of arbitrary sample size and an arbitrary number of datasets. Further, this tool accommodates versatile modeling, e.g., non-parametric and mixed-effect models. In addition, meta-eqtl readily handles calculation of chrX eQTLs.
We demonstrated and validated meta-eqtl as a fast and comprehensive tool for meta-analyzing multiple datasets and for ChrX eQTL discovery. Meta-eqtl is a set of command-line utilities written in R, with some computationally intensive parts written in C. The software runs on Linux platforms and is designed to adapt to high performance computing (HPC) clusters. We applied the tool to liver and adipose tissue data and revealed eSNPs underlying diabetes GWAS loci.
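The abstract does not say which combination rule meta-eqtl uses, so the following is only a minimal sketch of one common choice, a fixed-effect inverse-variance meta-analysis of a single SNP-gene pair across datasets. The function name and the toy inputs are hypothetical and are not part of the meta-eqtl software.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-dataset effect sizes with inverse-variance weights.

    betas: per-dataset effect estimates for one SNP-gene pair
    ses:   matching standard errors (all must be > 0)
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    NOTE: illustrative assumption, not the meta-eqtl implementation.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each dataset is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se

    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


if __name__ == "__main__":
    # Hypothetical estimates from two tissue-matched datasets.
    print(fixed_effect_meta([0.2, 0.4], [0.1, 0.1]))
```

Pooling narrows the standard error relative to any single dataset, which is the gain in power the abstract refers to for trans-eQTL mapping.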
PMCID: PMC4262975  PMID: 25431350
18.  Integrated genomic approaches to identification of candidate genes underlying metabolic and cardiovascular phenotypes in the spontaneously hypertensive rat 
Physiological Genomics  2011;43(21):1207-1218.
The spontaneously hypertensive rat (SHR) is a widely used rodent model of hypertension and metabolic syndrome. Previously we identified thousands of cis-regulated expression quantitative trait loci (eQTLs) across multiple tissues using a panel of rat recombinant inbred (RI) strains derived from Brown Norway and SHR progenitors. These cis-eQTLs represent potential susceptibility loci underlying physiological and pathophysiological traits manifested in SHR. We have prioritized 60 cis-eQTLs and confirmed differential expression between the parental strains by quantitative PCR in 43 (72%) of the eQTL transcripts. Quantitative trait transcript (QTT) analysis in the RI strains showed highly significant correlation between cis-eQTL transcript abundance and clinically relevant traits such as systolic blood pressure and blood glucose, with the physical location of a subset of the cis-eQTLs colocalizing with “physiological” QTLs (pQTLs) for these same traits. These colocalizing correlated cis-eQTLs (c3-eQTLs) are highly attractive as primary susceptibility loci for the colocalizing pQTLs. Furthermore, sequence analysis of the c3-eQTL genes identified single nucleotide polymorphisms (SNPs) that are predicted to affect transcription factor binding affinity, splicing and protein function. These SNPs, which potentially alter transcript abundance and stability, represent strong candidate factors underlying not just eQTL expression phenotypes, but also the correlated metabolic and physiological traits. In conclusion, by integration of genomic sequence, eQTL and QTT datasets we have identified several genes that are strong positional candidates for pathophysiological traits observed in the SHR strain. These findings provide a basis for the functional testing and ultimate elucidation of the molecular basis of these metabolic and cardiovascular phenotypes.
PMCID: PMC3217321  PMID: 21846806
expression quantitative trait locus; spontaneously hypertensive rat; quantitative trait transcript; sequence variation
19.  Identifying the genetic determinants of transcription factor activity 
Genome-wide messenger RNA expression levels are highly heritable. However, the molecular mechanisms underlying this heritability are poorly understood. The influence of trans-acting polymorphisms is often mediated by changes in the regulatory activity of one or more sequence-specific transcription factors (TFs). We use a method that exploits prior information about the DNA-binding specificity of each TF to estimate its genotype-specific regulatory activity. To this end, we perform linear regression of genotype-specific differential mRNA expression on TF-specific promoter-binding affinity. Treating inferred TF activity as a quantitative trait and mapping it across a panel of segregants from an experimental genetic cross allows us to identify trans-acting loci (‘aQTLs') whose allelic variation modulates the TF. A few of these aQTL regions contain the gene encoding the TF itself; several others contain a gene whose protein product is known to interact with the TF. Our method is strictly causal, as it only uses sequence-based features as predictors. Application to budding yeast demonstrates a dramatic increase in statistical power, compared with existing methods, to detect locus-TF associations and trans-acting loci. Our aQTL mapping strategy also succeeds in mouse.
Genetic sequence variation naturally perturbs mRNA expression levels in the cell. In recent years, analysis of parallel genotyping and expression profiling data for segregants from genetic crosses between parental strains has revealed that mRNA expression levels are highly heritable. Expression quantitative trait loci (eQTLs), whose allelic variation regulates the expression level of individual genes, have successfully been identified (Brem et al, 2002; Schadt et al, 2003). The molecular mechanisms underlying the heritability of mRNA expression are poorly understood. However, they are likely to involve mediation by transcription factors (TFs). We present a new transcription-factor-centric method that greatly increases our ability to understand what drives the genetic variation in mRNA expression (Figure 1). Our method identifies genomic loci (‘aQTLs') whose allelic variation modulates the protein-level activity of specific TFs. To map aQTLs, we integrate genotyping and expression profiling data with quantitative prior information about DNA-binding specificity of transcription factors in the form of position-specific affinity matrices (Bussemaker et al, 2007). We applied our method in two different organisms: budding yeast and mouse.
In our approach, the inferred TF activity is explicitly treated as a quantitative trait, and genetically mapped. The decrease of ‘phenotype space' from that of all genes (in the eQTL approach) to that of all TFs (in our aQTL approach) increases the statistical power to detect trans-acting loci in two distinct ways. First, as each inferred TF activity is derived from a large number of genes, it is far less noisy than mRNA levels of individual genes. Second, the number of trait/marker combinations that needs to be tested for statistical significance in parallel is roughly two orders of magnitude smaller than for eQTLs. We identified a total of 103 locus-TF associations, a more than six-fold improvement over the 17 locus-TF associations identified by several existing methods (Brem et al, 2002; Yvert et al, 2003; Lee et al, 2006; Smith and Kruglyak, 2008; Zhu et al, 2008). The total number of distinct genomic loci identified as an aQTL equals 31, which includes 11 of the 13 previously identified eQTL hotspots (Smith and Kruglyak, 2008).
To better understand the mechanisms underlying the identified genetic linkages, we examined the genes within each aQTL region. First, we found four ‘local' aQTLs, which encompass the gene encoding the TF itself. This includes the known polymorphism in the HAP1 gene (Brem et al, 2002), but also novel predictions of trans-acting polymorphisms in RFX1, STB5, and HAP4. Second, using high-throughput protein–protein interaction data, we identified putative causal genes for several aQTLs. For example, we predict that a polymorphism in the cyclin-dependent kinase CDC28 antagonistically modulates the functionally distinct cell cycle regulators Fkh1 and Fkh2. In this and other cases, our approach naturally accounts for post-translational modulation of TF activity at the protein level.
We validated our ability to predict locus-TF associations in yeast using gene expression profiles of allele replacement strains from a previous study (Smith and Kruglyak, 2008). Chromosome 15 contains an aQTL whose allelic status influences the activity of no fewer than 30 distinct TFs. This locus includes IRA2, which controls intracellular cAMP levels. We used the gene expression profile of IRA2 replacement strains to confirm that the polymorphism within IRA2 indeed modulates a subset of the TFs whose activity was predicted to link to this locus, and no other TFs.
Application of our approach to mouse data identified an aQTL modulating the activity of a specific TF in liver cells. We identified an aQTL on mouse chromosome 7 for Zscan4, a transcription factor containing four zinc finger domains and a SCAN domain. Even though we could not detect a candidate causal gene for Zscan4p because of lack of information about the mouse genome, our result demonstrates that our method also works in higher eukaryotes.
In summary, aQTL mapping has a greatly improved sensitivity to detect molecular mechanisms underlying the heritability of gene expression. The successful application of our approach to yeast and mouse data underscores the value of explicitly treating the inferred TF activity as a quantitative trait for increasing statistical power of detecting trans-acting loci. Furthermore, our method is computationally efficient, and easily applicable to any other organism whenever prior information about the DNA-binding specificity of TFs is available.
Analysis of parallel genotyping and expression profiling data has shown that mRNA expression levels are highly heritable. Currently, only a tiny fraction of this genetic variance can be mechanistically accounted for. The influence of trans-acting polymorphisms on gene expression traits is often mediated by transcription factors (TFs). We present a method that exploits prior knowledge about the in vitro DNA-binding specificity of a TF in order to map the loci (‘aQTLs') whose inheritance modulates its protein-level regulatory activity. Genome-wide regression of differential mRNA expression on predicted promoter affinity is used to estimate segregant-specific TF activity, which is subsequently mapped as a quantitative phenotype. In budding yeast, our method identifies six times as many locus-TF associations and more than twice as many trans-acting loci as all existing methods combined. Application to mouse data from an F2 intercross identified an aQTL on chromosome VII modulating the activity of Zscan4 in liver cells. Our method has greatly improved statistical power over existing methods, is mechanism based, strictly causal, computationally efficient, and generally applicable.
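The two steps described above (regressing differential expression on promoter affinity to get a per-segregant TF activity, then treating that activity as a trait) can be sketched as follows. This is a simplified illustration under stated assumptions: one TF, ordinary least squares with an intercept, and a two-allele marker coded 0/1. All function names and the toy data are hypothetical, and the published method may differ in its details.

```python
from statistics import mean


def regression_slope(x, y):
    """Ordinary least-squares slope of y on x (with intercept)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has zero variance")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def infer_tf_activity(affinity, expression_by_segregant):
    """One activity value per segregant.

    affinity: predicted promoter affinity of the TF for each gene
    expression_by_segregant: for each segregant, the differential
        expression of the same genes in the same order
    The slope across genes is taken as the inferred TF activity.
    """
    return [regression_slope(affinity, expr) for expr in expression_by_segregant]


def marker_effect(activities, genotypes):
    """Difference in mean activity between allele 1 and allele 0 carriers.

    A stand-in for a linkage statistic; a real analysis would also
    assess significance, for example by permutation.
    """
    group0 = [a for a, g in zip(activities, genotypes) if g == 0]
    group1 = [a for a, g in zip(activities, genotypes) if g == 1]
    if not group0 or not group1:
        raise ValueError("both genotype classes must be present")
    return mean(group1) - mean(group0)


if __name__ == "__main__":
    # Hypothetical toy data: 4 genes, 4 segregants, 1 marker.
    affinity = [0.0, 1.0, 2.0, 3.0]
    expression = [
        [0.0, 1.0, 2.0, 3.0],
        [0.0, 2.0, 4.0, 6.0],
        [0.0, -1.0, -2.0, -3.0],
        [1.0, 1.0, 1.0, 1.0],
    ]
    activities = infer_tf_activity(affinity, expression)
    print(activities, marker_effect(activities, [0, 1, 0, 1]))
```

Because each activity value summarizes many genes, there is one trait per TF rather than one per gene, which is the reduction in phenotype space the text credits for the gain in power.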
PMCID: PMC2964119  PMID: 20865005
gene expression; gene regulatory networks; genetic variation; quantitative trait loci; transcription factors
20.  seeQTL: a searchable database for human eQTLs 
Bioinformatics  2011;28(3):451-452.
Summary: seeQTL is a comprehensive and versatile eQTL database, including various eQTL studies and a meta-analysis of HapMap eQTL information. The database presents eQTL association results in a convenient browser, using both segmented local-association plots and genome-wide Manhattan plots.
Availability and implementation: seeQTL is freely available for non-commercial use at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3268245  PMID: 22171328
21.  GRASP: analysis of genotype–phenotype results from 1390 genome-wide association studies and corresponding open access database 
Bioinformatics  2014;30(12):i185-i194.
Summary: We created a deeply extracted and annotated database of genome-wide association studies (GWAS) results. GRASP v1.0 contains >6.2 million SNP-phenotype associations from among 1390 GWAS studies. We re-annotated GWAS results with 16 annotation sources including some rarely compared to GWAS results (e.g. RNA editing sites, lincRNAs, PTMs).
Motivation: To create a high-quality resource to facilitate further use and interpretation of human GWAS results in order to address important scientific questions.
Results: GWAS have grown exponentially, with increases in sample sizes and markers tested, and continuing bias toward European ancestry samples. GRASP contains >100 000 phenotypes, roughly: eQTLs (71.5%), metabolite QTLs (21.2%), methylation QTLs (4.4%) and diseases, biomarkers and other traits (2.8%). cis-eQTLs, meQTLs, mQTLs and MHC region SNPs are highly enriched among significant results. After removing these categories, GRASP still contains a greater proportion of studies and results than comparable GWAS catalogs. Cardiovascular disease and related risk factors predominate among the remaining GWAS results, followed by immunological, neurological and cancer traits. Significant results in GWAS display a highly gene-centric tendency. Sex chromosome X (OR = 0.18[0.16-0.20]) and Y (OR = 0.003[0.001-0.01]) genes are depleted for GWAS results. Gene length is correlated with GWAS results at nominal significance (P ≤ 0.05) levels. We show this gene-length correlation decays at increasingly more stringent P-value thresholds. Potential pleiotropic genes and SNPs enriched for multi-phenotype association in GWAS are identified. However, we note possible population stratification at some of these loci. Finally, via re-annotation we identify compelling functional hypotheses at GWAS loci, in some cases unrealized in studies to date.
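For readers unfamiliar with the "OR = 0.18[0.16-0.20]" notation, the sketch below shows how an odds ratio and an approximate 95% confidence interval can be derived from a 2x2 table. The abstract does not state how its intervals were computed; the Wald (log-scale) interval used here is an assumption, and the counts in the usage example are hypothetical rather than GRASP data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: in category, with a GWAS result     b: in category, without
    c: outside category, with a result     d: outside category, without
    z: normal quantile (1.96 gives an approximate 95% interval)
    NOTE: the Wald interval is an illustrative assumption.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for independent cell counts.
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, chosen only to produce an OR below 1.
    print(odds_ratio_ci(10, 90, 50, 50))
```

An interval that lies entirely below 1, as reported for the X and Y chromosome genes, indicates depletion rather than enrichment.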
Conclusion: Pooling summary-level GWAS results and re-annotating with bioinformatics predictions and molecular features provides a good platform for new insights.
Availability: The GRASP database is available at
PMCID: PMC4072913  PMID: 24931982
22.  An eQTL Analysis of Partial Resistance to Puccinia hordei in Barley 
PLoS ONE  2010;5(1):e8598.
Genetic resistance to barley leaf rust caused by Puccinia hordei involves both R genes and quantitative trait loci. The R genes provide higher but less durable resistance than the quantitative trait loci. Consequently, exploring quantitative or partial resistance has become a favorable alternative for controlling disease. Four quantitative trait loci for partial resistance to leaf rust have been identified in the doubled haploid Steptoe (St)/Morex (Mx) mapping population. Further investigations are required to study the molecular mechanisms underpinning partial resistance and ultimately identify the causal genes.
Methodology/Principal Findings
We explored partial resistance to barley leaf rust using a genetical genomics approach. We recorded RNA transcript abundance corresponding to each probe on a 15K Agilent custom barley microarray in seedlings from St and Mx and 144 doubled haploid lines of the St/Mx population. A total of 1154 and 1037 genes were, respectively, identified as being P. hordei-responsive among the St and Mx and differentially expressed between P. hordei-infected St and Mx. Normalized ratios from 72 distant-pair hybridisations were used to map the genetic determinants of variation in transcript abundance by expression quantitative trait locus (eQTL) mapping generating 15685 eQTL from 9557 genes. Correlation analysis identified 128 genes that were correlated with resistance, of which 89 had eQTL co-locating with the phenotypic quantitative trait loci (pQTL). Transcript abundance in the parents and conservation of synteny with rice allowed us to prioritise six genes as candidates for Rphq11, the pQTL of largest effect, and highlight one, a phospholipid hydroperoxide glutathione peroxidase (HvPHGPx) for detailed analysis.
The eQTL approach yielded information that led to the identification of strong candidate genes underlying pQTL for resistance to leaf rust in barley and on the general pathogen response pathway. The dataset will facilitate a systems appraisal of this host-pathogen interaction and, potentially, for other traits measured in this population.
PMCID: PMC2798965  PMID: 20066049
23.  Statistical properties of interval mapping methods on quantitative trait loci location: impact on QTL/eQTL analyses 
BMC Genetics  2012;13:29.
Quantitative trait loci (QTL) detection on a very large number of phenotypes, like eQTL detection on transcriptomic data, can be dramatically impaired by the statistical properties of interval mapping methods. One major consequence of these properties is the high number of QTL detected at marker locations. The present study aims at identifying and specifying the sources of this bias, in particular in the analysis of data from outbred populations. Analytical developments were carried out in a backcross situation in order to specify the bias and to propose an algorithm to control it. The outbred population context was studied through simulated data sets in a wide range of situations.
The likelihood ratio test was first analyzed under the "one QTL" hypothesis in a backcross population. Designs of sib families were then simulated and analyzed using the QTL Map software. On the basis of the theoretical results in backcross, parameters such as the population size, the density of the genetic map, the QTL effect and the true location of the QTL were taken into account under the "no QTL" and the "one QTL" hypotheses. A combination of two nonparametric tests, the Kolmogorov-Smirnov test and the Mann-Whitney-Wilcoxon test, was used in order to identify the parameters that affected the bias and to specify how much they influenced the estimation of QTL location.
A theoretical expression of the bias of the estimated QTL location was obtained for a backcross type population. We demonstrated a common source of bias under the "no QTL" and the "one QTL" hypotheses and qualified the possible influence of several parameters. Simulation studies confirmed that the bias exists in outbred populations under both the hypotheses of "no QTL" and "one QTL" on a linkage group. The QTL location was systematically closer to marker locations than expected, particularly in the case of low QTL effect, small population size or low density of markers, i.e. designs with low power. Practical recommendations for experimental designs for QTL detection in outbred populations are given on the basis of this bias quantification. Furthermore, an original algorithm is proposed to adjust the location of a QTL, obtained with interval mapping, that co-locates with a marker.
Therefore, one should be attentive when one QTL is mapped at the location of one marker, especially under low power conditions.
PMCID: PMC3386024  PMID: 22520935
QTL; linkage analysis; QTL location; bias
24.  Two distinct classes of QTL determine rust resistance in sorghum 
BMC Plant Biology  2014;14:366.
Agriculture is facing enormous challenges to feed a growing population in the face of rapidly evolving pests and pathogens. The rusts, in particular, are a major pathogen of cereal crops with the potential to cause large reductions in yield. Improving stable disease resistance is an on-going major and challenging focus for many plant breeding programs, due to the rapidly evolving nature of the pathogen. Sorghum is a major summer cereal crop that is also a host for a rust pathogen, Puccinia purpurea, which occurs in almost all sorghum growing areas of the world and causes direct and indirect yield losses worldwide; however, knowledge about its genetic control is still limited. In order to further investigate this issue, QTL and association mapping methods were implemented to study rust resistance in three bi-parental populations and an association mapping set of elite breeding lines in different environments.
In total, 64 significant or highly significant QTL and 21 suggestive rust resistance QTL were identified representing 55 unique genomic regions. Comparisons across populations within the current study and with rust QTL identified previously in both sorghum and maize revealed a high degree of correspondence in QTL location. Negative phenotypic correlations were observed between rust, maturity and height, indicating a trend for both early maturing and shorter genotypes to be more susceptible to rust.
The significant amount of QTL co-location across traits, in addition to the consistency in the direction of QTL allele effects, has provided evidence to support pleiotropic QTL action across rust, height, maturity and stay-green, supporting the role of carbon stress in susceptibility to rust. Classical rust resistance QTL regions that did not co-locate with height, maturity or stay-green QTL were found to be significantly enriched for the defence-related NBS-encoding gene family, in contrast to the lack of defence-related gene enrichment in multi-trait effect rust resistance QTL. The distinction of disease resistance QTL hot-spots, enriched with defence-related gene families from QTL which impact on development and partitioning, provides plant breeders with knowledge which will allow for fast-tracking varieties with both durable pathogen resistance and appropriate adaptive traits.
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-014-0366-4) contains supplementary material, which is available to authorized users.
PMCID: PMC4335369  PMID: 25551674
Rust resistance; Sorghum; Pleiotropy; Height; Maturity; Stay-green; QTL mapping; Association mapping
25.  Impact of Natural Genetic Variation on Gene Expression Dynamics 
PLoS Genetics  2013;9(6):e1003514.
DNA sequence variation causes changes in gene expression, which in turn has profound effects on cellular states. These variations affect tissue development and may ultimately lead to pathological phenotypes. A genetic locus containing a sequence variation that affects gene expression is called an “expression quantitative trait locus” (eQTL). Whereas the impact of cellular context on expression levels in general is well established, a lot less is known about the cell-state specificity of eQTL. Previous studies differed with respect to how “dynamic eQTL” were defined. Here, we propose a unified framework distinguishing static, conditional and dynamic eQTL and suggest strategies for mapping these eQTL classes. Further, we introduce a new approach to simultaneously infer eQTL from different cell types. By using murine mRNA expression data from four stages of hematopoiesis and 14 related cellular traits, we demonstrate that static, conditional and dynamic eQTL, although derived from the same expression data, represent functionally distinct types of eQTL. While static eQTL affect generic cellular processes, non-static eQTL are more often involved in hematopoiesis and immune response. Our analysis revealed substantial effects of individual genetic variation on cell type-specific expression regulation. Among a total number of 3,941 eQTL we detected 2,729 static eQTL, 1,187 eQTL were conditionally active in one or several cell types, and 70 eQTL affected expression changes during cell type transitions. We also found evidence for feedback control mechanisms reverting the effect of an eQTL specifically in certain cell types. Loci correlated with hematological traits were enriched for conditional eQTL, thus, demonstrating the importance of conditional eQTL for understanding molecular mechanisms underlying physiological trait variation. 
The classification proposed here has the potential to streamline and unify future analysis of conditional and dynamic eQTL as well as many other kinds of QTL data.
Author Summary
Complex physiological traits are affected through subtle changes of molecular traits like gene expression in the relevant tissues, which in turn are caused by genetic variation. A genetic locus containing a sequence variation affecting gene expression is called an expression quantitative trait locus (eQTL). Understanding the tissue and cell type specificity of eQTL effects is essential for revealing the molecular mechanisms underlying disease phenotypes. However, so far the cell-state dependence of eQTL is poorly understood. In order to systematically assess the importance of cell state-specific eQTL, we propose to distinguish static, conditional and dynamic eQTL and suggest strategies for mapping these eQTL classes. We applied our framework to mouse gene expression data from four hematopoietic stages and related cellular traits. The different eQTL classes, although derived from the same expression data, represent functionally distinct types of eQTL. Importantly, conditional eQTL are well correlated with relevant hematological traits. These findings emphasize the condition specificity of many regulatory relationships, even if the conditions under study are related. This calls for due caution when transferring conclusions about regulatory mechanisms across cell types or tissues. The proposed classification will also help to unravel dynamic behaviors in many other kinds of QTL data.
PMCID: PMC3674999  PMID: 23754949
