1.  WormQTLHD—a web database for linking human disease to natural variation data in C. elegans 
Nucleic Acids Research  2013;42(D1):D794-D801.
Interactions between proteins are highly conserved across species. As a result, the molecular basis of multiple diseases affecting humans can be studied in model organisms that offer many alternative experimental opportunities. One such organism, Caenorhabditis elegans, has been used to produce much molecular quantitative genetics and systems biology data over the past decade. We present WormQTLHD (Human Disease), a database that quantitatively and systematically links expression Quantitative Trait Loci (eQTL) findings in C. elegans to gene–disease associations in man. WormQTLHD, available online at http://www.wormqtl-hd.org, is a user-friendly set of tools to reveal functionally coherent, evolutionarily conserved gene networks. These can be used to predict novel gene-to-gene associations and the functions of genes underlying the disease of interest. We created a new database that links C. elegans eQTL data sets to human diseases (34 337 gene–disease associations from OMIM, DGA, GWAS Central and NHGRI GWAS Catalogue) based on overlapping sets of orthologous genes associated with phenotypes in these two species. We utilized QTL results, high-throughput molecular phenotypes, classical phenotypes and genotype data covering different developmental stages and environments from the WormQTL database. All software is available as open source, built on MOLGENIS and xQTL workbench.
doi:10.1093/nar/gkt1044
PMCID: PMC3965109  PMID: 24217915
2.  Genome-wide methylation profiling identifies hypermethylated biomarkers in high-grade cervical intraepithelial neoplasia 
Epigenetics  2012;7(11):1268-1278.
Epigenetic modifications, such as aberrant DNA promoter methylation, are frequently observed in cervical cancer. Identification of hypermethylated regions allowing discrimination between normal cervical epithelium and high-grade cervical intraepithelial neoplasia (CIN2/3), or worse, may improve current cervical cancer population-based screening programs. In this study, the DNA methylome of high-grade CIN lesions was studied using genome-wide DNA methylation screening to identify potential biomarkers for early diagnosis of cervical neoplasia. Methylated DNA Immunoprecipitation (MeDIP) combined with DNA microarray was used to compare DNA methylation profiles of epithelial cells derived from high-grade CIN lesions with normal cervical epithelium. Hypermethylated differentially methylated regions (DMRs) were identified. Validation of nine selected DMRs using BSP and MSP in cervical tissue revealed methylation in 63.2–94.7% of high-grade CIN and in 59.3–100% of cervical carcinomas. QMSP for the two most significant high-grade CIN-specific methylation markers was conducted to explore test performance in a large series of cervical scrapings. Frequency and relative level of methylation were significantly different between normal and cancer samples. Clinical validation of both markers in cervical scrapings from patients with an abnormal cervical smear confirmed that frequency and relative level of methylation were related to increasing severity of the underlying CIN lesion and that ROC analysis was discriminative. These markers represent the COL25A1 and KATNAL2 genes, and their observed increased methylation upon progression could intimate a regulatory role in carcinogenesis. In conclusion, our newly identified hypermethylated DMRs represent specific DNA methylation patterns in high-grade CIN lesions and are candidate biomarkers for early detection.
doi:10.4161/epi.22301
PMCID: PMC3499328  PMID: 23018867
cervical precancerous lesion; DNA methylation; MeDIP-chip; cervical scraping
3.  WormQTL—public archive and analysis web portal for natural variation data in Caenorhabditis spp 
Nucleic Acids Research  2012;41(D1):D738-D743.
Here, we present WormQTL (http://www.wormqtl.org), an easily accessible database enabling search, comparative analysis and meta-analysis of all data on variation in Caenorhabditis spp. Over the past decade, Caenorhabditis elegans has become instrumental for molecular quantitative genetics and the systems biology of natural variation. These efforts have resulted in a valuable amount of phenotypic, high-throughput molecular and genotypic data across different developmental worm stages and environments in hundreds of C. elegans strains. WormQTL provides a workbench of analysis tools for genotype–phenotype linkage and association mapping based on but not limited to R/qtl (http://www.rqtl.org). All data can be uploaded and downloaded using simple delimited text or Excel formats and are accessible via a public web user interface for biologists and R statistic and web service interfaces for bioinformaticians, based on open source MOLGENIS and xQTL workbench software. WormQTL welcomes data submissions from other worm researchers.
doi:10.1093/nar/gks1124
PMCID: PMC3531126  PMID: 23180786
4.  Quantile-Based Permutation Thresholds for Quantitative Trait Loci Hotspots 
Genetics  2012;191(4):1355-1365.
Quantitative trait loci (QTL) hotspots (genomic locations affecting many traits) are a common feature in genetical genomics studies and are biologically interesting since they may harbor critical regulators. Therefore, statistical procedures to assess the significance of hotspots are of key importance. One approach, randomly allocating observed QTL across the genomic locations separately by trait, implicitly assumes all traits are uncorrelated. Recently, an empirical test for QTL hotspots was proposed on the basis of the number of traits that exceed a predetermined LOD value, such as the standard permutation LOD threshold. The permutation null distribution of the maximum number of traits across all genomic locations preserves the correlation structure among the phenotypes, avoiding the detection of spurious hotspots due to nongenetic correlation induced by uncontrolled environmental factors and unmeasured variables. However, by considering only the number of traits above a threshold, without accounting for the magnitude of the LOD scores, relevant information is lost. In particular, biologically interesting hotspots composed of a moderate to small number of traits with strong LOD scores may be neglected as nonsignificant. In this article we propose a quantile-based permutation approach that simultaneously accounts for the number and the LOD scores of traits within the hotspots. By considering a sliding scale of mapping thresholds, our method can assess the statistical significance of both small and large hotspots. Although the proposed approach can be applied to any type of heritable high-volume “omic” data set, we restrict our attention to expression (e)QTL analysis. We assess and compare the performances of these three methods in simulations and we illustrate how our approach can effectively assess the significance of moderate and small hotspots with strong LOD scores in a yeast expression data set.
doi:10.1534/genetics.112.139451
PMCID: PMC3416013  PMID: 22661325
hotspots; permutation tests; multiple traits; LOD scores; quantitative trait loci (QTL)
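The permutation idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the marker-regression LOD formula (LOD = -(n/2) log10(1 - r^2)) and the exact form of the sliding scale are assumptions chosen for illustration. Genotype rows are permuted jointly against the whole phenotype matrix, so the correlation among traits is preserved under the null. For each hotspot size k, the sketch takes the genome-wide maximum of the k-th largest LOD score at a marker and uses the (1 - alpha) quantile of that value across permutations as the LOD threshold.

```python
import numpy as np

def lod_matrix(geno, pheno):
    """Marker-regression LOD scores, shape (traits, markers).

    geno: (individuals, markers), pheno: (individuals, traits).
    Assumes no column is constant (non-zero standard deviation).
    """
    n = geno.shape[0]
    g = (geno - geno.mean(0)) / geno.std(0)
    p = (pheno - pheno.mean(0)) / pheno.std(0)
    r = p.T @ g / n                          # trait-by-marker correlations
    r2 = np.clip(r ** 2, 0.0, 1.0 - 1e-12)   # guard against log10(0)
    return -(n / 2.0) * np.log10(1.0 - r2)

def sliding_thresholds(geno, pheno, sizes, n_perm=200, alpha=0.05, seed=0):
    """Hypothetical helper: one LOD threshold per hotspot size in `sizes`."""
    rng = np.random.default_rng(seed)
    null = np.empty((n_perm, len(sizes)))
    for i in range(n_perm):
        # Shuffle individuals in the genotypes only; traits stay aligned
        # with each other, which keeps their correlation structure intact.
        perm = rng.permutation(geno.shape[0])
        lod = lod_matrix(geno[perm], pheno)
        ranked = -np.sort(-lod, axis=0)      # descending LOD per marker
        for j, k in enumerate(sizes):
            null[i, j] = ranked[k - 1].max() # k-th largest LOD, max over markers
    return np.quantile(null, 1.0 - alpha, axis=0)
```

Under these assumptions, a hotspot of size k would be called significant when at least k traits exceed the k-th threshold at one marker. The thresholds decrease as k grows, so small hotspots need strong LOD scores while large hotspots can be built from weaker ones.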
5.  xQTL workbench: a scalable web environment for multi-level QTL analysis 
Bioinformatics  2012;28(7):1042-1044.
Summary: xQTL workbench is a scalable web platform for the mapping of quantitative trait loci (QTLs) at multiple levels: for example gene expression (eQTL), protein abundance (pQTL), metabolite abundance (mQTL) and phenotype (phQTL) data. Popular QTL mapping methods for model organism and human populations are accessible via the web user interface. Large calculations scale easily on to multi-core computers, clusters and Cloud. All data involved can be uploaded and queried online: markers, genotypes, microarrays, NGS, LC-MS, GC-MS, NMR, etc. When new data types become available, xQTL workbench is quickly customized using the Molgenis software generator.
Availability: xQTL workbench runs on all common platforms, including Linux, Mac OS X and Windows. An online demo system, installation guide, tutorials, software and source code are available under the LGPL3 license from http://www.xqtl.org.
Contact: m.a.swertz@rug.nl
doi:10.1093/bioinformatics/bts049
PMCID: PMC3315722  PMID: 22308096
6.  R/qtl: high-throughput multiple QTL mapping 
Bioinformatics  2010;26(23):2990-2992.
Motivation: R/qtl is free and powerful software for mapping and exploring quantitative trait loci (QTL). R/qtl provides a fully comprehensive range of methods for a wide range of experimental cross types. We recently added multiple QTL mapping (MQM) to R/qtl. MQM adds higher statistical power to detect and disentangle the effects of multiple linked and unlinked QTL compared with many other methods. MQM for R/qtl adds many new features including improved handling of missing data, analysis of tens of thousands of molecular traits, permutation for determining significance thresholds for QTL and QTL hot spots, and visualizations for cis–trans and QTL interaction effects. MQM for R/qtl is the first free and open source implementation of MQM that is multi-platform, scalable and suitable for automated procedures and large genetical genomics datasets.
Availability: R/qtl is free and open source multi-platform software for the statistical language R, and is made available under the GPLv3 license. R/qtl can be installed from http://www.rqtl.org/. R/qtl queries should be directed at the mailing list, see http://www.rqtl.org/list/.
Contact: kbroman@biostat.wisc.edu
doi:10.1093/bioinformatics/btq565
PMCID: PMC2982156  PMID: 20966004
7.  Critical reasoning on causal inference in genome-wide linkage and association studies 
Trends in genetics : TIG  2010;26(12):493-498.
Genome-wide linkage and association studies of tens of thousands of clinical and molecular traits are currently under way, offering rich data for inferring causality between traits and genetic variation. However, the inference process is based on discovering subtle patterns in the correlation between traits and is therefore challenging and could create a flood of untrustworthy causal inferences. Here we introduce the concerns and show they are valid already in simple scenarios of two traits linked or associated to the same genomic region. We argue that more comprehensive analysis and Bayesian reasoning are needed and can overcome some of these pitfalls, although not in every conceivable case. We conclude that causal inference methods may still be of use in the iterative process of mathematical modeling and biological validation.
doi:10.1016/j.tig.2010.09.002
PMCID: PMC2991400  PMID: 20951462
8.  Trans-eQTLs Reveal That Independent Genetic Variants Associated with a Complex Phenotype Converge on Intermediate Genes, with a Major Role for the HLA 
PLoS Genetics  2011;7(8):e1002197.
For many complex traits, genetic variants have been found associated. However, it is still mostly unclear through which downstream mechanism these variants cause these phenotypes. Knowledge of these intermediate steps is crucial to understand pathogenesis, while also providing leads for potential pharmacological intervention. Here we relied upon natural human genetic variation to identify effects of these variants on trans-gene expression (expression quantitative trait locus mapping, eQTL) in whole peripheral blood from 1,469 unrelated individuals. We looked at 1,167 published trait- or disease-associated SNPs and observed trans-eQTL effects on 113 different genes, of which we replicated 46 in monocytes of 1,490 different individuals and 18 in a smaller dataset that comprised subcutaneous adipose, visceral adipose, liver tissue, and muscle tissue. HLA single-nucleotide polymorphisms (SNPs) were 10-fold enriched for trans-eQTLs: 48% of the trans-acting SNPs map within the HLA, including ulcerative colitis susceptibility variants that affect plausible candidate genes AOAH and TRBV18 in trans. We identified 18 pairs of unlinked SNPs associated with the same phenotype and affecting expression of the same trans-gene (21 times more than expected, P < 10^−16). This was particularly pronounced for mean platelet volume (MPV): Two independent SNPs significantly affect the well-known blood coagulation genes GP9 and F13A1 but also C19orf33, SAMD14, VCL, and GNG11. Several of these SNPs have a substantially higher effect on the downstream trans-genes than on the eventual phenotypes, supporting the concept that the effects of these SNPs on expression seem to be much less multifactorial. Therefore, these trans-eQTLs could well represent some of the intermediate genes that connect genetic variants with their eventual complex phenotypic outcomes.
Author Summary
Many genetic variants have been found associated with diseases. However, for many of these genetic variants, it remains unclear how they exert their effect on the eventual phenotype. We investigated genetic variants that are known to be associated with diseases and complex phenotypes and assessed whether these variants were also associated with gene expression levels in a set of 1,469 unrelated whole blood samples. For several diseases, such as type 1 diabetes and ulcerative colitis, we observed that genetic variants affect the expression of genes, not implicated before. For complex traits, such as mean platelet volume and mean corpuscular volume, we observed that independent genetic variants on different chromosomes influence the expression of exactly the same genes. For mean platelet volume, these genes include well-known blood coagulation genes but also genes with still unknown functions. These results indicate that, by systematically correlating genetic variation with gene expression levels, it is possible to identify downstream genes, which provide important avenues for further research.
doi:10.1371/journal.pgen.1002197
PMCID: PMC3150446  PMID: 21829388
9.  DiffCoEx: a simple and sensitive method to find differentially coexpressed gene modules 
BMC Bioinformatics  2010;11:497.
Background
Large microarray datasets have enabled gene regulation to be studied through coexpression analysis. While numerous methods have been developed for identifying differentially expressed genes between two conditions, the field of differential coexpression analysis is still relatively new. More specifically, there is so far no sensitive and untargeted method to identify gene modules (also known as gene sets or clusters) that are differentially coexpressed between two conditions. Here, sensitive and untargeted means that the method should be able to construct de novo modules by grouping genes based on shared, but subtle, differential correlation patterns.
Results
We present DiffCoEx, a novel method for identifying correlation pattern changes, which builds on the commonly used Weighted Gene Coexpression Network Analysis (WGCNA) framework for coexpression analysis. We demonstrate its usefulness by identifying biologically relevant, differentially coexpressed modules in a rat cancer dataset.
Conclusions
DiffCoEx is a simple and sensitive method to identify gene coexpression differences between multiple conditions.
doi:10.1186/1471-2105-11-497
PMCID: PMC2976757  PMID: 20925918
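As a rough illustration of what a differential coexpression score can look like, the following Python sketch compares gene-gene correlations between two conditions. It is a simplification under stated assumptions and not the published DiffCoEx code: the function name is hypothetical, `beta` stands in for a WGCNA-style soft-threshold power, and the later steps that would group genes into modules (for example clustering on the resulting matrix) are omitted.

```python
import numpy as np

def differential_coexpression(expr_a, expr_b, beta=6):
    """Hypothetical helper: pairwise correlation-change scores in [0, 1].

    expr_a, expr_b: (samples, genes) expression matrices for two conditions,
    with the same genes in the same column order.
    """
    c_a = np.corrcoef(expr_a, rowvar=False)
    c_b = np.corrcoef(expr_b, rowvar=False)

    def signed_square(c):
        # Keep the sign so that a flip from +r to -r counts as a change.
        return np.sign(c) * c ** 2

    diff = np.abs(signed_square(c_a) - signed_square(c_b))
    score = np.sqrt(0.5 * diff) ** beta      # soft threshold suppresses noise
    np.fill_diagonal(score, 0.0)
    return score
```

Gene pairs that are tightly correlated in one condition and uncorrelated in the other receive high scores, whereas pairs whose correlation barely changes are pushed toward zero by the exponent.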
10.  RNAi Experiments in D. melanogaster: Solutions to the Overlooked Problem of Off-Targets Shared by Independent dsRNAs 
PLoS ONE  2010;5(10):e13119.
Background
RNAi technology is widely used to downregulate specific gene products. Investigating the phenotype induced by downregulation of gene products provides essential information about the function of the specific gene of interest. When RNAi is applied in Drosophila melanogaster or Caenorhabditis elegans, large dsRNAs are often used. One of the drawbacks of RNAi technology is that unwanted gene products with sequence similarity to the gene of interest can be downregulated too. To verify the outcome of an RNAi experiment and to avoid these unwanted off-target effects, an additional non-overlapping dsRNA can be used to downregulate the same gene. However, it has never been tested whether this approach is sufficient to reduce the risk of off-targets.
Methodology
We created a novel tool to analyze the occurrence of off-target effects in Drosophila and we analyzed 99 randomly chosen genes.
Principal Findings
Here we show that nearly all genes contain non-overlapping internal sequences that do show overlap in a common off-target gene.
Conclusion
Based on our in silico findings, off-target effects should not be ignored and our presented on-line tool enables the identification of two RNA interference constructs, free of overlapping off-targets, from any gene of interest.
doi:10.1371/journal.pone.0013119
PMCID: PMC2948504  PMID: 20957038
11.  Expression Quantitative Trait Loci Are Highly Sensitive to Cellular Differentiation State 
PLoS Genetics  2009;5(10):e1000692.
Genetical genomics is a strategy for mapping gene expression variation to expression quantitative trait loci (eQTLs). We performed a genetical genomics experiment in four functionally distinct but developmentally closely related hematopoietic cell populations isolated from the BXD panel of recombinant inbred mouse strains. This analysis allowed us to analyze eQTL robustness/sensitivity across different cellular differentiation states. Although we identified a large number (365) of “static” eQTLs that were consistently active in all four cell types, we found a much larger number (1,283) of “dynamic” eQTLs showing cell-type–dependence. Of these, 140, 45, 531, and 295 were preferentially active in stem, progenitor, erythroid, and myeloid cells, respectively. A detailed investigation of those dynamic eQTLs showed that in many cases the eQTL specificity was associated with expression changes in the target gene. We found no evidence for target genes that were regulated by distinct eQTLs in different cell types, suggesting that large-scale changes within functional regulatory networks are uncommon. Our results demonstrate that heritable differences in gene expression are highly sensitive to the developmental stage of the cell population under study. Therefore, future genetical genomics studies should aim at studying multiple well-defined and highly purified cell types in order to construct as comprehensive a picture of the changing functional regulatory relationships as possible.
Author Summary
Blood cell development from multipotent hematopoietic stem cells to specialized blood cells is accompanied by drastic changes in gene expression for which the triggers remain mostly unknown. Genetical genomics is an approach linking natural genetic variation to gene expression variation, thereby allowing the identification of genomic loci containing gene expression modulators (eQTLs). In this paper, we used a genetical genomics approach to analyze gene expression across four developmentally close blood cell types collected from a large number of genetically different but related mouse strains. We found that, while a significant number of eQTLs (365) had a consistent “static” regulatory effect on gene expression, an even larger number were found to be very sensitive to cell stage. As many as 1,283 eQTLs exhibited a “dynamic” behavior across cell types. By looking more closely at these dynamic eQTLs, we show that the sensitivity of eQTLs to cell stage is largely associated with gene expression changes in target genes. These results stress the importance of studying gene expression variation in well-defined cell populations. Only such studies will be able to reveal the important differences in gene regulation between different cell types.
doi:10.1371/journal.pgen.1000692
PMCID: PMC2757904  PMID: 19834560
12.  designGG: an R-package and web tool for the optimal design of genetical genomics experiments 
BMC Bioinformatics  2009;10:188.
Background
High-dimensional biomolecular profiling of genetically different individuals in one or more environmental conditions is an increasingly popular strategy for exploring the functioning of complex biological systems. The optimal design of such genetical genomics experiments in a cost-efficient and effective way is not trivial.
Results
This paper presents designGG, an R package for designing optimal genetical genomics experiments. A web implementation for designGG is available at . All software, including source code and documentation, is freely available.
Conclusion
DesignGG allows users to intelligently select and allocate individuals to experimental units and conditions such as drug treatment. The user can maximize the power and resolution of detecting genetic, environmental and interaction effects in a genome-wide or local mode by giving more weight to genome regions of special interest, such as previously detected phenotypic quantitative trait loci. This will help to achieve high power and more accurate estimates of the effects of interesting factors, and thus yield a more reliable biological interpretation of data. DesignGG is applicable to linkage analysis of experimental crosses, e.g. recombinant inbred lines, as well as to association analysis of natural populations.
doi:10.1186/1471-2105-10-188
PMCID: PMC2706229  PMID: 19538731
13.  Complex nature of SNP genotype effects on gene expression in primary human leucocytes 
Background
Genome-wide association studies have been hugely successful in identifying disease risk variants, yet most variants do not lead to coding changes and how variants influence biological function is usually unknown.
Methods
We correlated gene expression and genetic variation in untouched primary leucocytes (n = 110) from individuals with celiac disease – a common condition with multiple risk variants identified. We compared our observations with an EBV-transformed HapMap B cell line dataset (n = 90), and performed a meta-analysis to increase power to detect non-tissue specific effects.
Results
In celiac peripheral blood, 2,315 SNP variants influenced gene expression at 765 different transcripts (< 250 kb from SNP, at FDR = 0.05, cis expression quantitative trait loci, eQTLs). 135 of the detected SNP-probe effects (reflecting 51 unique probes) were also detected in a HapMap B cell line published dataset, all with effects in the same allelic direction. Overall gene expression differences within the two datasets predominantly explain the limited overlap in observed cis-eQTLs. Celiac associated risk variants from two regions, containing genes IL18RAP and CCR3, showed significant cis genotype-expression correlations in the peripheral blood but not in the B cell line datasets. We identified 14 genes where a SNP affected the expression of different probes within the same gene, but in opposite allelic directions. By incorporating genetic variation in co-expression analyses, functional relationships between genes can be more significantly detected.
Conclusion
In conclusion, the complex nature of genotypic effects in human populations makes the use of a relevant tissue, large datasets, and analysis of different exons essential to enable the identification of the function for many genetic risk variants in common diseases.
doi:10.1186/1755-8794-2-1
PMCID: PMC2628677  PMID: 19128478
15.  R/parallel – speeding up bioinformatics analysis with R 
BMC Bioinformatics  2008;9:390.
Background
R is the preferred tool for statistical analysis of many bioinformaticians due in part to the increasing number of freely available analytical methods. Such methods can be quickly reused and adapted to each particular experiment. However, in experiments where large amounts of data are generated, for example using high-throughput screening devices, the processing time required to analyze data is often quite long. A solution to reduce the processing time is the use of parallel computing technologies. Because R does not support parallel computations, several tools have been developed to enable such technologies. However, these tools require multiple modifications to the way R programs are usually written or run. Although these tools can ultimately speed up the calculations, the time, skills and additional resources required to use them are an obstacle for most bioinformaticians.
Results
We have designed and implemented an R add-on package, R/parallel, that extends R by adding user-friendly parallel computing capabilities. With R/parallel any bioinformatician can now easily automate the parallel execution of loops and benefit from the multicore processor power of today's desktop computers. Using a single and simple function, R/parallel can be integrated directly with other existing R packages. With no need to change the implemented algorithms, the processing time can be approximately reduced N-fold, N being the number of available processor cores.
Conclusion
R/parallel saves bioinformaticians time in their daily tasks of analyzing experimental data. It achieves this objective on two fronts: first, by reducing development time of parallel programs by avoiding reimplementation of existing methods and second, by reducing processing time by speeding up computations on current desktop computers. Future work is focused on extending the envelope of R/parallel by interconnecting and aggregating the power of several computers, both existing office computers and computing clusters.
doi:10.1186/1471-2105-9-390
PMCID: PMC2557021  PMID: 18808714
16.  Association testing by haplotype-sharing methods applicable to whole-genome analysis 
BMC Proceedings  2007;1(Suppl 1):S129.
We propose two new haplotype-sharing methods for identifying disease loci: the haplotype sharing statistic (HSS), which compares length of shared haplotypes between cases and controls, and the CROSS test, which tests whether a case and a control haplotype show less sharing than two random haplotypes. The significance of the HSS is determined using a variance estimate from the theory of U-statistics, whereas the significance of the CROSS test is estimated from a sequential randomization procedure. Both methods are fast and hence practical, even for whole-genome screens with high marker densities. We analyzed data sets of Problems 2 and 3 of Genetic Analysis Workshop 15 and compared HSS and CROSS to conventional association methods. Problem 2 provided a data set of 2300 single-nucleotide polymorphisms (SNPs) in a 10-Mb region of chromosome 18q, which had shown linkage evidence for rheumatoid arthritis. The CROSS test detected a significant association at approximately position 4407 kb. This was supported by single-marker association and HSS. The CROSS test outperformed them both with respect to significance level and signal-to-noise ratio. A 20-kb candidate region could be identified. Problem 3 provided a simulated 10 k SNP data set covering the whole genome. Three known candidate regions for rheumatoid arthritis were detected. Again, the CROSS test gave the most significant results. Furthermore, both the HSS and the CROSS showed better fine-mapping accuracy than straightforward haplotype association. In conclusion, haplotype sharing methods, particularly the CROSS test, show great promise for identifying disease gene loci.
PMCID: PMC2367507  PMID: 18466471
17.  Sequence Polymorphisms Cause Many False cis eQTLs 
PLoS ONE  2007;2(7):e622.
Many investigations have reported the successful mapping of quantitative trait loci (QTLs) for gene expression phenotypes (eQTLs). Local eQTLs, where expression phenotypes map to the genes themselves, are of especially great interest, because they are direct candidates for previously mapped physiological QTLs. Here we show that many mapped local eQTLs in genetical genomics experiments do not reflect actual expression differences caused by sequence polymorphisms in cis-acting factors changing mRNA levels. Instead they indicate hybridization differences caused by sequence polymorphisms in the mRNA region that is targeted by the microarray probes. Many such polymorphisms can be detected by a sensitive and novel statistical approach that takes the individual probe signals into account. Applying this approach to recent mouse and human eQTL data, we demonstrate that indeed many local eQTLs are falsely reported as “cis-acting” or “cis” and can be successfully detected and eliminated with this approach.
doi:10.1371/journal.pone.0000622
PMCID: PMC1906859  PMID: 17637838
18.  Mapping Determinants of Gene Expression Plasticity by Genetical Genomics in C. elegans 
PLoS Genetics  2006;2(12):e222.
Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic response of gene expression also shows heritable difference has not yet been studied. Here we show that differential expression induced by temperatures of 16 °C and 24 °C has a strong genetic component in Caenorhabditis elegans recombinant inbred strains derived from a cross between strains CB4856 (Hawaii) and N2 (Bristol). No less than 59% of 308 trans-acting genes showed a significant eQTL-by-environment interaction, here termed plasticity quantitative trait loci. In contrast, only 8% of an estimated 188 cis-acting genes showed such interaction. This indicates that heritable differences in plastic responses of gene expression are largely regulated in trans. This regulation is spread over many different regulators. However, for one group of trans-genes we found prominent evidence for a common master regulator: a transband of 66 coregulated genes appeared at 24 °C. Our results suggest widespread genetic variation of differential expression responses to environmental impacts and demonstrate the potential of genetical genomics for mapping the molecular determinants of phenotypic plasticity.
Synopsis
It is widely documented that environmental changes will induce differential expression of genes, yet it is unknown how these patterns of environment-induced expression plasticity are inherited and how they differ between genetically divergent individuals of a biological species. In this paper the authors used recombinant inbred lines of the nematode worm C. elegans that were derived from parental lines originally collected in Bristol (United Kingdom) and Hawaii, and measured genome-wide gene expression at two different temperatures. Using statistical analysis tools developed for quantitative trait locus mapping, they found genes with genetically determined differences in their plastic response to temperature changes. A majority of them were found to be regulated by genes at a different genome position (regulated in trans). A striking observation was a group of 66 genes that share a common potential regulator and may be related to differences in fertility plasticity. These results show that differential responses of different genotypes to environmental changes are widespread. Because all species are subjected to environmental change, both at individual and evolutionary time scales, the authors' work calls for studying the heritable component of plasticity of gene regulation in other organisms to enhance understanding of the environmental forces that drive evolutionary adaptation.
doi:10.1371/journal.pgen.0020222
PMCID: PMC1756913  PMID: 17196041
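The eQTL-by-environment interaction described in the abstract above can be made concrete with a small Python sketch. It assumes a plain linear model with a genotype, an environment and a genotype-by-environment term for a single gene and marker. The function name and the use of an F statistic are illustrative choices, not the statistical procedure used in the paper.

```python
import numpy as np

def interaction_f_stat(expr, genotype, env):
    """Hypothetical helper: F statistic for the genotype-by-environment term.

    expr: expression of one gene per sample; genotype: 0/1 parental allele
    at one marker; env: 0/1 environment code (for example 16 or 24 degrees C).
    """
    n = len(expr)
    ones = np.ones(n)
    reduced = np.column_stack([ones, genotype, env])
    full = np.column_stack([ones, genotype, env, genotype * env])

    def rss(design):
        # Residual sum of squares of an ordinary least-squares fit.
        coef, *_ = np.linalg.lstsq(design, expr, rcond=None)
        resid = expr - design @ coef
        return float(resid @ resid)

    rss_reduced, rss_full = rss(reduced), rss(full)
    # One extra parameter in the full model, so the numerator has 1 d.f.
    return (rss_reduced - rss_full) / (rss_full / (n - full.shape[1]))
```

A large value indicates that the genotype effect on expression differs between the two environments, which is the pattern the abstract terms a plasticity quantitative trait locus.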
19.  SIMAGE: simulation of DNA-microarray gene expression data 
BMC Bioinformatics  2006;7:205.
Background
Simulation of DNA-microarray data serves at least three purposes: (i) optimizing the design of an intended DNA microarray experiment, (ii) comparing existing pre-processing and processing methods for best analysis of a given DNA microarray experiment, (iii) educating students, lab-workers and other researchers by making them aware of the many factors influencing DNA microarray experiments.
Results
Our model has multiple layers of factors influencing the experiment. The relative influence of such factors can differ significantly between labs, experiments within labs, etc. Therefore, we have added a module to roughly estimate their parameters from a given data set. This guarantees that our simulated data mimics real data as closely as possible.
Conclusion
We introduce a model for the simulation of dual-dye cDNA-microarray data closely resembling real data and name the model and its software implementation "SIMAGE", which stands for simulation of microarray gene expression data. The software is freely accessible at: .
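The idea of layering several sources of variation on top of a true signal can be sketched in a few lines. The example below is a simplified illustration and does not reproduce SIMAGE's actual model, factors, or parameter names; `simulate_spot`, `simulate_array`, and all default values are assumptions made for this sketch. Each spot gets a true log2 ratio, a dye bias, an array-wide offset, and independent measurement noise per channel.

```python
import random


def simulate_spot(log2_ratio, base_log2=10.0, dye_bias=0.0,
                  array_effect=0.0, noise_sd=0.0, rng=random):
    """Return simulated (Cy5, Cy3) log2 intensities for one spot.

    The true log2 ratio and the dye bias are split symmetrically between
    the two channels; the array effect shifts both channels equally.
    """
    cy5 = (base_log2 + log2_ratio / 2 + dye_bias / 2 + array_effect
           + rng.gauss(0.0, noise_sd))
    cy3 = (base_log2 - log2_ratio / 2 - dye_bias / 2 + array_effect
           + rng.gauss(0.0, noise_sd))
    return cy5, cy3


def simulate_array(true_log2_ratios, dye_bias, array_sd, noise_sd, seed=0):
    """Simulate one array: one shared array effect, independent spot noise."""
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    array_effect = rng.gauss(0.0, array_sd)
    return [
        simulate_spot(ratio, dye_bias=dye_bias, array_effect=array_effect,
                      noise_sd=noise_sd, rng=rng)
        for ratio in true_log2_ratios
    ]


if __name__ == "__main__":
    for cy5, cy3 in simulate_array([1.0, 0.0, -2.0], dye_bias=0.2,
                                   array_sd=0.3, noise_sd=0.1):
        print(round(cy5 - cy3, 3))
```

With the noise terms set to zero, the measured log ratio of a spot is simply its true log ratio plus the dye bias, which makes it easy to check how a given normalization method removes each layer.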
doi:10.1186/1471-2105-7-205
PMCID: PMC1479841  PMID: 16613602
20.  Mapping of Gene Expression Reveals CYP27A1 as a Susceptibility Gene for Sporadic ALS 
PLoS ONE  2012;7(4):e35333.
Amyotrophic lateral sclerosis (ALS) is a progressive, neurodegenerative disease characterized by loss of upper and lower motor neurons. ALS is considered to be a complex trait and genome-wide association studies (GWAS) have implicated a few susceptibility loci. However, many more causal loci remain to be discovered. Since it has been shown that genetic variants associated with complex traits are more likely to be eQTLs than frequency-matched variants from GWAS platforms, we conducted a two-stage genome-wide screening for eQTLs associated with ALS. In addition, we applied an eQTL analysis to fine-map association loci. Expression profiles from peripheral blood of 323 sporadic ALS patients and 413 controls were mapped to genome-wide genotyping data. Subsequently, data from a two-stage GWAS (3,568 patients and 10,163 controls) were used to prioritize eQTLs identified in the first stage (162 ALS, 207 controls). These prioritized eQTLs were carried forward to the second sample with both gene-expression and genotyping data (161 ALS, 206 controls). Replicated eQTL SNPs were then tested for association in the second-stage GWAS data to find disease-associated SNPs that survived correction for multiple testing. We thus identified twelve cis eQTLs with nominally significant associations in the second-stage GWAS data. Eight SNP-transcript pairs of highest significance (lowest p = 1.27×10^-51) withstood multiple-testing correction in the second stage and modulated CYP27A1 gene expression. Additionally, we show that C9orf72 appears to be the only gene in the 9p21.2 locus that is regulated in cis, showing the potential of this approach in identifying causative genes in association loci in ALS. This study has identified candidate genes for sporadic ALS, most notably CYP27A1. Mutations in CYP27A1 cause cerebrotendinous xanthomatosis, which can present as a clinical mimic of ALS with progressive upper motor neuron loss, making it a plausible susceptibility gene for ALS.
doi:10.1371/journal.pone.0035333
PMCID: PMC3324559  PMID: 22509407
21.  Genome-Wide Association Study Identifies Novel Loci Associated with Circulating Phospho- and Sphingolipid Concentrations 
PLoS Genetics  2012;8(2):e1002490.
Phospho- and sphingolipids are crucial cellular and intracellular compounds. These lipids are required for active transport, a number of enzymatic processes, membrane formation, and cell signalling. Disruption of their metabolism leads to several diseases, with diverse neurological, psychiatric, and metabolic consequences. A large number of phospholipid and sphingolipid species can be detected and measured in human plasma. We conducted a meta-analysis of five European family-based genome-wide association studies (N = 4034) on plasma levels of 24 sphingomyelins (SPM), 9 ceramides (CER), 57 phosphatidylcholines (PC), 20 lysophosphatidylcholines (LPC), 27 phosphatidylethanolamines (PE), and 16 PE-based plasmalogens (PLPE), as well as their proportions in each major class. This effort yielded 25 genome-wide significant loci for phospholipids (smallest P-value = 9.88×10^-204) and 10 loci for sphingolipids (smallest P-value = 3.10×10^-57). After a correction for multiple comparisons (P-value < 2.2×10^-9), we observed four novel loci significantly associated with phospholipids (PAQR9, AGPAT1, PKD2L1, PDXDC1) and two with sphingolipids (PLD2 and APOE) explaining up to 3.1% of the variance. Further analysis of the top findings with respect to within class molar proportions uncovered three additional loci for phospholipids (PNLIPRP2, PCDH20, and ABDH3) suggesting their involvement in either fatty acid elongation/saturation processes or fatty acid specific turnover mechanisms. Among those, 14 loci (KCNH7, AGPAT1, PNLIPRP2, SYT9, FADS1-2-3, DLG2, APOA1, ELOVL2, CDK17, LIPC, PDXDC1, PLD2, LASS4, and APOE) mapped into the glycerophospholipid and 12 loci (ILKAP, ITGA9, AGPAT1, FADS1-2-3, APOA1, PCDH20, LIPC, PDXDC1, SGPP1, APOE, LASS4, and PLD2) to the sphingolipid pathways. In large meta-analyses, associations between FADS1-2-3 and carotid intima media thickness, AGPAT1 and type 2 diabetes, and APOA1 and coronary artery disease were observed.
In conclusion, our study identified nine novel phospho- and sphingolipid loci, substantially increasing our knowledge of the genetic basis for these traits.
Author Summary
Phospho- and sphingolipids are integral to membrane formation and are involved in crucial cellular functions such as signalling, membrane fluidity, membrane protein trafficking, neurotransmission, and receptor trafficking. In addition to severe monogenic diseases resulting from defective phospho- and sphingolipid function and metabolism, the evidence suggests that variations in these lipid levels at the population level are involved in the determination of cardiovascular and neurologic traits and subsequent disease. We took advantage of modern laboratory methods, including microarray-based genotyping and electrospray ionization tandem mass spectrometry, to hunt for genetic variation influencing the levels of more than 350 phospho- and sphingolipid phenotypes. We identified nine novel loci, in addition to confirming a number of previously described loci. Several other genetic regions provided substantial evidence of their involvement in these traits. All of these loci are strong candidates for further research in the field of lipid biology and are likely to yield considerable insights into the complex metabolic pathways underlying circulating phospho- and sphingolipid levels. Understanding these mechanisms might help to illuminate factors leading to the development of common cardiovascular and neurological diseases and might provide molecular targets for the development of new therapies.
doi:10.1371/journal.pgen.1002490
PMCID: PMC3280968  PMID: 22359512
22.  Unraveling the Regulatory Mechanisms Underlying Tissue-Dependent Genetic Variation of Gene Expression 
PLoS Genetics  2012;8(1):e1002431.
It is known that genetic variants can affect gene expression, but it is not yet completely clear through what mechanisms genetic variation mediates this expression. We therefore compared the cis-effect of single nucleotide polymorphisms (SNPs) on gene expression between blood samples from 1,240 human subjects and four primary non-blood tissues (liver, subcutaneous adipose tissue, visceral adipose tissue, and skeletal muscle) from 85 subjects. We characterized four different mechanisms for 2,072 probes that show tissue-dependent genetic regulation between blood and non-blood tissues: on average 33.2% only showed cis-regulation in non-blood tissues; 14.5% of the eQTL probes were regulated by different, independent SNPs depending on the tissue of investigation; 47.9% showed a different effect size although they were regulated by the same SNPs. Surprisingly, we observed that 4.4% were regulated by the same SNP but with opposite allelic direction. We show here that SNPs that are located in transcriptional regulatory elements are enriched for tissue-dependent regulation, including SNPs at 3′ and 5′ untranslated regions (P = 1.84×10^-5 and 4.7×10^-4, respectively) and SNPs that are synonymous-coding (P = 9.9×10^-4). SNPs that are associated with complex traits more often exert a tissue-dependent effect on gene expression (P = 2.6×10^-10). Our study yields new insights into the genetic basis of tissue-dependent expression and suggests that complex trait associated genetic variants have even more complex regulatory effects than previously anticipated.
Author Summary
Gene expression can be affected by genetic variation, e.g. single nucleotide polymorphisms (SNPs). These are called expression-affecting SNPs or eSNPs. Gene expression levels are known to vary across different tissues in the same individual, despite the fact that genetic variation is the same in these tissues. We explored the different mechanisms by which genetic variants can mediate tissue-dependent gene expression. We observed that the genetic variants that are associated with complex traits are more likely to affect gene expression in a tissue-dependent manner. Our results suggest that complex traits are even more complex than we had anticipated, and they underline the great importance of using expression data from tissues relevant to the disease being studied in order to further the understanding of the biology underlying the disease association.
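The four categories of tissue-dependent regulation named in the abstract can be written down as a simple decision rule. The sketch below is a deliberate simplification, not the study's statistical procedure: the `CisEffect` type, the `classify` function, and the 20% effect-size tolerance are hypothetical, and treating "different top SNP" as "independent SNPs" ignores the linkage-disequilibrium checks a real analysis would need.

```python
from typing import NamedTuple, Optional


class CisEffect(NamedTuple):
    """Summary of a probe's cis-eQTL in one tissue (hypothetical structure)."""
    snp: Optional[str]  # top cis-SNP, or None if no significant cis-eQTL
    beta: float = 0.0   # allelic effect, relative to the same reference allele


def classify(blood: CisEffect, other: CisEffect, rel_tol: float = 0.2) -> str:
    """Assign a probe to one of the tissue-dependence categories."""
    if other.snp is None:
        return "no cis-regulation in non-blood tissue"
    if blood.snp is None:
        return "cis-regulation only in non-blood tissue"
    if blood.snp != other.snp:
        return "regulated by different SNPs"
    if blood.beta * other.beta < 0:
        return "same SNP, opposite allelic direction"
    largest = max(abs(blood.beta), abs(other.beta))
    if abs(blood.beta - other.beta) > rel_tol * largest:
        return "same SNP, different effect size"
    return "same SNP, similar effect"


if __name__ == "__main__":
    print(classify(CisEffect("rs1", 0.5), CisEffect("rs1", -0.4)))
```

Ordering the checks this way means a probe is first tested for presence of regulation, then for SNP identity, then for direction, and only last for magnitude, so each probe lands in exactly one category.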
doi:10.1371/journal.pgen.1002431
PMCID: PMC3261927  PMID: 22275870
23.  Bioinformatics tools and database resources for systems genetics analysis in mice—a short review and an evaluation of future needs 
Briefings in Bioinformatics  2011;13(2):135-142.
During a meeting of the SYSGENET working group ‘Bioinformatics’, currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future compatibility of software and exchange of already developed software modules, a strong recommendation was made by the group to integrate HAPPY and R/qtl analysis toolboxes, GeneNetwork and XGAP database platforms, and TIQS and xQTL processing platforms. R should be used as the principal computer language for QTL data analysis in all platforms and a ‘cloud’ should be used for software dissemination to the community. Furthermore, the working group recommended that all data models and software source code should be made visible in public repositories to allow a coordinated effort on the use of common data structures and file formats.
doi:10.1093/bib/bbr026
PMCID: PMC3294237  PMID: 22396485
QTL mapping; database; mouse; systems genetics
24.  The MOLGENIS toolkit: rapid prototyping of biosoftware at the push of a button 
BMC Bioinformatics  2010;11(Suppl 12):S12.
Background
There is a huge demand on bioinformaticians to provide their biologists with user-friendly and scalable software infrastructures to capture, exchange, and exploit the unprecedented amounts of new *omics data. We here present MOLGENIS, a generic, open source, software toolkit to quickly produce the bespoke MOLecular GENetics Information Systems needed.
Methods
The MOLGENIS toolkit provides bioinformaticians with a simple language to model biological data structures and user interfaces. At the push of a button, MOLGENIS’ generator suite automatically translates these models into a feature-rich, ready-to-use web application including database, user interfaces, exchange formats, and scriptable interfaces. Each generator is a template of SQL, JAVA, R, or HTML code that would require much effort to write by hand. This ‘model-driven’ method ensures reuse of best practices and improves quality because the modeling language and generators are shared between all MOLGENIS applications, so that errors are found quickly and improvements are shared easily by a re-generation. A plug-in mechanism ensures that both the generator suite and generated product can be customized just as much as hand-written software.
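The model-driven approach described above can be illustrated with a toy generator. This sketch is not MOLGENIS code and does not use its modeling language or templates; the `MODEL` dictionary, the type mapping, and both generator functions are hypothetical. It shows only the general principle that a single declarative model can be fed to several generators, each emitting a different artifact.

```python
# A hypothetical data model: one entity with three typed fields.
MODEL = {
    "entity": "Sample",
    "fields": [("id", "int"), ("name", "string"), ("concentration", "decimal")],
}

# Assumed mapping from model types to SQL column types.
SQL_TYPES = {"int": "INTEGER", "string": "VARCHAR(255)", "decimal": "DOUBLE"}


def generate_sql(model):
    """Emit a CREATE TABLE statement for the modeled entity."""
    columns = ",\n".join(
        f"  {name} {SQL_TYPES[kind]}" for name, kind in model["fields"]
    )
    return f"CREATE TABLE {model['entity']} (\n{columns}\n);"


def generate_html_form(model):
    """Emit a minimal HTML form with one input per modeled field."""
    inputs = "\n".join(
        f'  <label>{name} <input name="{name}"></label>'
        for name, _ in model["fields"]
    )
    return f"<form>\n{inputs}\n</form>"


if __name__ == "__main__":
    print(generate_sql(MODEL))
    print(generate_html_form(MODEL))
```

Adding a field to `MODEL` and re-running both generators updates the database schema and the form together, which is the reuse benefit the abstract attributes to re-generation.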
Results
In recent years we have successfully evaluated the MOLGENIS toolkit for the rapid prototyping of many types of biomedical applications, including next-generation sequencing, GWAS, QTL, proteomics and biobanking. Writing 500 lines of model XML typically replaces 15,000 lines of hand-written programming code, which allows for quick adaptation if the information system is not yet to the biologist’s satisfaction. Each application generated with MOLGENIS comes with an optimized database back-end, user interfaces for biologists to manage and exploit their data, programming interfaces for bioinformaticians to script analysis tools in R, Java, SOAP, REST/JSON and RDF, a tab-delimited file format to ease upload and exchange of data, and detailed technical documentation. Existing databases can be quickly enhanced with MOLGENIS generated interfaces using the ‘ExtractModel’ procedure.
Conclusions
The MOLGENIS toolkit provides bioinformaticians with a simple model to quickly generate flexible web platforms for all possible genomic, molecular and phenotypic experiments with a richness of interfaces not provided by other tools. All the software and manuals are available free as LGPLv3 open source at http://www.molgenis.org.
doi:10.1186/1471-2105-11-S12-S12
PMCID: PMC3040526  PMID: 21210979
25.  XGAP: a uniform and extensible data model and software platform for genotype and phenotype experiments 
Genome Biology  2010;11(3):R27.
XGAP, a software platform for the integration and analysis of genotype and phenotype data.
We present an extensible software model for the genotype and phenotype community, XGAP. Readers can download a standard XGAP (http://www.xgap.org) or auto-generate a custom version using MOLGENIS with programming interfaces to R-software and web-services or user interfaces for biologists. XGAP has simple load formats for any type of genotype, epigenotype, transcript, protein, metabolite or other phenotype data. Current functionality includes tools ranging from eQTL analysis in mouse to genome-wide association studies in humans.
doi:10.1186/gb-2010-11-3-r27
PMCID: PMC2864567  PMID: 20214801
